I'm pretty new to Python, and since I don't learn much by just copying the code a tutorial shows me, I figured it would be better to actually build something, so I decided on a WhatsApp chat analyzer.
I only got so far using Google and now I'm stuck again. For reference, I'm following this website, which explains how to make the chat analyzer but doesn't actually give you any code.
This is what I've managed so far: reading and printing out the .txt file.
f = open(r"chat.txt","r+", encoding='utf-8')
file_contents = f.read()
print(file_contents)
That just outputs the entire .txt file of chats.
Next, the website says I should count the total number of messages and the total number of words.
It suggests doing something along these lines:
Strings can be searched much like lists, so you can do a search like this:
if "- Paridhi:" in chat_line:
counter+=1
You need to first define counter=0
Just like this:
splt = file_contents.split()
print(splt)
counter = 0
if 'file' in splt:
    counter = counter + 1
print(counter)
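I think what the website means is something like this rough sketch (I'm not sure it's right, and I'm assuming every message line contains a "- <name>:" marker like the "- Paridhi:" example above):

# rough sketch: count messages and words line by line
message_count = 0
word_count = 0
for chat_line in file_contents.splitlines():
    if "- Paridhi:" in chat_line:          # a line belonging to this sender
        message_count += 1
        word_count += len(chat_line.split())
print(message_count, "messages,", word_count, "words")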
So I was bored and my friend suggested I should code an anti-cheat, because I couldn't come up with anything to code myself. It's supposed to dump the web browser's history and then search through it using keywords. That way, when a user is under investigation, instead of them having to screenshare and click through every file and manually check the browser history, this would automate that task. Anyway, here's the code I've made so far.
from browser_history import get_history
output = get_history()
his = output.histories
outputfile = open("demo.txt", "w")
print(his, file= outputfile)
outputfile.close()
with open('demo.txt') as f:
    if "hack" in f.read():
        print("True")
It works, but I also want it to read keywords out of a file and then print those keywords if they are found. So, for example, if the user has searched for "minecraft cheat" or something like that, it would print that it has found a search for "minecraft cheat".
I'm sorry if it's a dumb question, but I have spent quite a while looking and can't really find any good tutorial on it. Also, I was just doing some testing and for some reason it doesn't print any of the history from today, only yesterday. So if anyone knows a good way to get the history, I'd love to hear suggestions on how to improve the code.
You just need to make a small change in how you read from the file:
from browser_history import get_history

output = get_history()
his = output.histories

with open("demo.txt", "w") as f:
    f.write(str(his))   # his is a list, so turn it into a string before writing

with open("demo.txt", "r") as f:
    for line in f:
        if "hack" in line:
            print("True")
Since 'his' is already a list, you could also search it directly instead of storing it in a file first, but it's up to you!
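For instance, something like this rough sketch (I'm assuming each entry in his is a (timestamp, url) tuple, which is what browser-history gave me when I tried it):

# search the history entries directly instead of going through demo.txt
for entry in his:
    url = str(entry[1])      # assuming (timestamp, url) tuples
    if "hack" in url:
        print("True")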
I had never heard of the browser-history library before. Pretty neat idea, and you could have a lot of fun with this project, making gradual improvements to your program. Enjoy!
A small addition to the above answer, since I think you wanted to search for multiple keywords and print the keyword that was found rather than just "True". You could iterate over a list of keywords for each line as follows:
from browser_history import get_history
output = get_history()
his = output.histories
outputfile = open("demo.txt", "w")
print(his, file=outputfile)
outputfile.close()
keywords = ['hack', 'minecraft cheat', 'i love cheating']
with open("demo.txt", "r") as f:
for line in f:
for keyword in keywords:
if keyword in line:
print(keyword, "found")
It is important that the keyword loop is inside the line loop here: there are lots of lines, so you don't want to iterate over them several times.
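If you want to read the keywords from a file instead of hard-coding the list, a quick sketch (keywords.txt is just a placeholder name, with one keyword or phrase per line):

# keywords.txt is hypothetical: one keyword or phrase per line
with open("keywords.txt", "r") as kf:
    keywords = [line.strip() for line in kf if line.strip()]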
By the way, I had a look at the browser-history documentation but couldn't work out why it doesn't return all the history. For me it returns history back as far as 27th December, but my actual browser history goes back much further. Good luck getting to the bottom of that.
Hey people and coding friends.
I'm currently writing a little tool in Python 3 that should write some user input from a .txt file into a PDF file. For this I am using pdfkit. I first write my input into a second .txt file, because I need the output in a special form and pdfkit.from_string(source, destination.pdf) doesn't seem to work properly for that. So, as I said, I write it into a second .txt file first and then convert it with pdfkit.from_file("source.txt", destination.pdf). I already sorted out the problem with the different Unicode stuff, but somehow the "PDF creator" keeps deleting/changing all umlauts (ä, ö, ü). I guess I could figure out why it keeps doing that, but I can't find a solution, since it's a third-party import. The created .txt file, by the way, is in exactly the form I need; just the PDF part is not working.
For example:
Does anyone have a solution for me?
Either of these would also work for me:
writing from Python into a different text format that is easy to convert into PDF (with umlauts!)
OR
using third-party software that I call from Python to convert my .txt file
So far I've got this code (including some readLines methods, which are also working pretty well, I guess):
import pdfkit

def dok2pdf(fileName):
    # latin-1 so the umlauts survive reading the source file
    dokument = open(fileName, "r", encoding="latin-1").read()
    for line in dokument:                 # iterates character by character
        if line == " ":
            dokument = dokument.replace(line, " ")
        if line == ";":
            dokument = dokument.replace(line, "\n")   # turn ";" into line breaks
    file = open("tryhard.txt", "w")
    file.write(dokument)
    file.close()
    pdfkit.from_file("tryhard.txt", 'tryhard.pdf')
thanks a lot.
I'm making a program to find out how many times something appears in a file (which will be read by Python). My first step (just to make sure what I'm doing is correct) is to see the whole file opened in Python, if you get me.
def input_function():
    my_file = open("ChoralShieldData.csv", "r")
    ChoralShieldData = []
    for each_line in my_file:
        ChoralShieldData.append(each_line.split())
    return ChoralShieldData
#Main program
ChoralShieldData = input_function()
Thank you in advance
You neither print the result nor write it to a file, which is why you can't see any result.
You haven't told the code to show you anything. At the moment all it is doing is storing the data in your variable ChoralShieldData.
Try adding this to the very bottom of your code (pick one only):
print ChoralShieldData #Python 2.x
print(ChoralShieldData) #Python 3.x
Alternatively, add a similar print statement anywhere in your function to see what it is doing at specific stages.
See here for some more help on reading from and writing to files.
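Since the eventual goal is to count how many times something appears, here is a minimal sketch of a possible next step (the search value "example" is just a placeholder):

ChoralShieldData = input_function()
print(ChoralShieldData)          # see what was actually read

# count how many rows contain a given value ("example" is a placeholder)
target = "example"
count = sum(1 for row in ChoralShieldData if target in row)
print(count, "rows contain", target)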
I'm very new to Python (and coding in general, if I'm honest) and decided to learn by dipping into the Twitter API to make a weird Twitterbot that scrambles the words in a tweet and reposts them, _ebooks style.
Anyway, the way I have it currently set up, it pulls the latest tweet and then compares it to a .txt file with the previous tweet. If the tweet and the .txt file match (i.e., not a new tweet), it does nothing. If they don't, it replaces the .txt file with the current tweet, then scrambles and posts it. I feel like there's got to be a better way to do this than what I'm doing. Here's the relevant code:
words = hank[0]['text']
target = open("hank.txt", "r")
if words == "STOP":
    print "Sam says stop :'("
    return
else:
    if words == target.read():
        print "Nothing New."
    else:
        target.close()
        target = open("hank.txt", "w")
        target.write(words)
        target.close()
Obviously, opening as 'r' just to check it against the tweet, closing, and re-opening as 'w' is not very efficient. However, if I open as 'w+' it deletes all the contents of the file when I read it, and if I open it as 'r+', it adds the new tweet either to the beginning or the end of the file (dependent on where I set the pointer, obviously). I am 100% sure I am missing something TOTALLY obvious, but after hours of googling and dredging through Python documentation, I haven't found anything simpler. Any help would be more than welcome haha. :)
with open(filename, "r+") as f:
data = f.read()// Redaing the data
//any comparison of tweets etc..
f.truncate()//here basically it clears the file.
f.seek(0)// setting the pointer
f.write("most recent tweet")// writing to the file
There's no need to close the file explicitly; the with block closes it automatically.
Just read the Python docs on these methods for a clearer picture.
I suggest you use yield to compare hank.txt and words line by line, so that memory can be saved, if you are focused on efficiency.
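A rough sketch of what that could look like (words is the variable from your question, file_lines is just a name I made up, and this is only an illustration, not tested):

from itertools import zip_longest   # izip_longest on Python 2

def file_lines(path):
    # generator: yields one line at a time instead of loading the whole file
    with open(path, "r") as f:
        for line in f:
            yield line.rstrip("\n")

# compare the stored tweet and the new tweet line by line
is_same = all(a == b for a, b in
              zip_longest(file_lines("hank.txt"), words.splitlines()))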
As for the file operation, I don't think there is a better way to overwrite a file. If you are using Linux, maybe 'cat > hank.txt' could be faster. Just a guess.
So I'm trying to create a little script to deal with some logs. I'm just learning Python, but I know about loops and such in other languages. It seems I don't quite understand how loops work in Python.
I have a raw log from which I'm trying to isolate just the external IP addresses. An example line:
05/09/2011 17:00:18 192.168.111.26 192.168.111.255 Broadcast packet dropped udp/netbios-ns 0 0 X0 0 0 N/A
And here's the code I have so far:
import os, glob, fileinput, re

def parseips():
    f = open("126logs.txt", 'rb')
    r = open("rawips.txt", 'r+', os.O_NONBLOCK)
    for line in f:
        rf = open("rawips.txt", 'r+', os.O_NONBLOCK)
        ip = line.split()[3]
        res = re.search('192.168.', ip)
        if not res:
            rf.flush()
            for line2 in rf:
                if ip not in line2:
                    r.write(ip + '\n')
                    print 'else write'
                else:
                    print "no"
    f.close()
    r.close()
    rf.close()

parseips()
I have it parsing out the external IPs just fine. But, thinking like a ninja, I thought: how cool would it be to handle dupes? The idea was to check the file the IPs are being written to against the current line for a match, and if there is a match, not write. But this produces many times more dupes than before :) I could probably use something else, but I'm liking Python and it makes me look busy.
Thanks for any insider info.
DISCLAIMER: Since you are new to Python, I am going to try to show off a little, so you can look up some interesting "python things".
I'm going to print all the IPs to console:
def parseips():
    with open("126logs.txt", 'r') as f:
        for line in f:
            ip = line.split()[3]
            if ip.startswith('192.168.'):
                print "%s\n" % ip,
You might also want to look into:
f = open("126logs.txt",'r')
IPs = [line.split()[3] for line in f if line.split()[3].startswith('192.168.')]
Hope this helps,
Enjoy Python!
Something along the lines of this might do the trick:
import os, glob, fileinput, re

def parseips():
    prefix = '192.168.'
    # preload partial IPs from the existing file.
    if os.path.exists('rawips.txt'):
        with open('rawips.txt', 'rt') as f:
            partial_ips = set(ip.strip()[len(prefix):] for ip in f.readlines())
    else:
        partial_ips = set()
    with open('126logs.txt', 'rt') as input, open('rawips.txt', 'at') as output:
        for line in input:
            ip = line.split()[3]
            if ip.startswith(prefix) and ip[len(prefix):] not in partial_ips:
                partial_ips.add(ip[len(prefix):])
                output.write(ip + '\n')
parseips()
Rather than looping through the file you're writing, you might try just using a set. It might consume more memory, but your code will be much nicer, so it's probably worth it unless you run into an actual memory constraint.
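For example, a minimal sketch of what that could look like (r and ip are the names from your own code):

seen = set()

# ...inside your loop, once you have an external ip:
if ip not in seen:
    seen.add(ip)
    r.write(ip + '\n')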
Assuming you're just trying to avoid duplicate external IPs, consider creating an additional data structure in order to keep track of which IPs have already been written. Since they're in string format, a dictionary would be good for this.
externalIPDict = {}
# code to detect external IPs goes here - when you get one:
if externalIPString in externalIPDict:
    pass                                  # do nothing, you found a dupe
else:
    externalIPDict[externalIPString] = 1  # key on the IP string itself
    # your code to add the external IP to your file goes here