Get different strings from a file and write a .txt

Get different strings from a file and write a .txt - python

I'am trying to get lines from a text file (.log) into a .txt document.
I need get into my .txt file the same data. But the line itself is sometimes different. From what I have seen on internet, it's usualy done with a pattern that will anticipate how the line is made.
1525:22Player 11 spawned with userinfo: \team\b\forcepowers\0-5-030310001013001131\ip\46.98.134.211:24806\rate\25000\snaps\40\cg_predictItems\1\char_color_blue\34\char_color_green\34\char_color_red\34\color1\65507\color2\14942463\color3\2949375\color4\2949375\handicap\100\jp\0\model\desann/default\name\Faybell\pbindicator\1\saber1\saber_malgus_broken\saber2\none\sex\male\ja_guid\420D990471FC7EB6B3EEA94045F739B7\teamoverlay\1
The line i'm working with usualy looks like this. The data i'am trying to collect are :
\ip\0.0.0.0
\name\NickName_of_the_player
\ja_guid\420D990471FC7EB6B3EEA94045F739B7
And print these data, inside a .txt file. Here is my current code.
As explained above, i'am unsure about what keyword to use for my research on google. And how this could be called (Because the string isn't the same?)
I have been looking around alot, and most of the test I have done, have allowed me to do some things, but i'am not yet able to do as explained above. So i'am in hope for guidance here :) (Sorry if i'am noobish, I understand alot how it works, I just didn't learned language in school, I mostly do small scripts, and usualy they work fine, this time it's way harder)
def readLog(filename):
with open(filename,'r') as eventLog:
data = eventLog.read()
dataList = data.splitlines()
return dataList
eventLog = readLog('games.log')

You'll need to read the files in "raw" mode rather than as strings. When reading the file from disk, use open(filename,'rb'). To use your example, I ran
text_input = r"1525:22Player 11 spawned with userinfo: \team\b\forcepowers\0-5-030310001013001131\ip\46.98.134.211:24806\rate\25000\snaps\40\cg_predictItems\1\char_color_blue\34\char_color_green\34\char_color_red\34\color1\65507\color2\14942463\color3\2949375\color4\2949375\handicap\100\jp\0\model\desann/default\name\Faybell\pbindicator\1\saber1\saber_malgus_broken\saber2\none\sex\male\ja_guid\420D990471FC7EB6B3EEA94045F739B7\teamoverlay\1"
text_as_array = text_input.split('\\')
You'll need to know which columns contain the strings you care about. For example,
with open('output.dat','w') as fil:
fil.write(text_as_array[6])
You can figure these array positions from the sample string
>>> text_as_array[6]
'46.98.134.211:24806'
>>> text_as_array[34]
'Faybell'
>>> text_as_array[44]
'420D990471FC7EB6B3EEA94045F739B7'
If the column positions are not consistent but the key-value pairs are always adjacent, we can leverage that
>>> text_as_array.index("ip")
5
>>> text_as_array[text_as_array.index("ip")+1]
'46.98.134.211:24806'

Related

Best way to write rows of a numpy array to file inside, NOT after, a loop?

I'm new here and to python in general, so please forgive any formatting issues and whatever else. I'm a physicist and I have a parametric model, where I want to iterate over one or more of the model's parameter values (possibly in an MCMC setting). But for simplicity, imagine I have just a single parameter with N possible values. In a loop, I compute the model and several scalar metrics pertaining to it.
I want to save the data [parameter value, metric1, metric2, ...] line-by-line to a file. I don't care what type: .pickle, .npz, .txt, .csv or anything else are fine.
I do NOT want to save the array after all N models have been computed. The issue here is that, sometimes a parameter value is so nonphysical that the program I call to calculate the model (which is a giant complicated thing years in development, so I'm not touching it) crashes the kernel. If I have N = 30000 models to do, and this happens at 29000, I'll be very unhappy and have wasted a lot of time. I also probably have to be conscious of memory usage - I've figured out how to do what I propose with a text file, but it crashes around 2600 lines because I don't think it likes opening a text file that long.
So, some pseudo-code:
filename = 'outFile.extension'
dataArray = np.zeros([N,3])
idx = 0
for p in Parameter1:
modelOutputVector = calculateModel(p)
metric1, metric2 = getMetrics(modelOutputVector)
dataArray[idx,0] = p
dataArray[idx,1] = metric1
dataArray[idx,2] = metric2
### Line that saves data here
idx+=1
I'm partial to npz or pickle formats, but can't figure out how to do this with either. If there is a better format or a better solution, I appreciate any advice.
Edit: What I tried to make a text file was this, inside the loop:
fileObject = open(filename, 'ab')
np.savetxt(fileObject, rowOfData, delimiter = ',', newline = ' ')
fileObject.write('\n')
fileObject.close()
The first time it crashed at 2600 or whatever I thought it was just coincidence, but every time I try this, that's where it stops. I could hack it and make a batch of files that are all 2600 lines, but there's got to be a better solution.

Its hard to say with such a limited knowledge of the error, but if you think it is a file writing error maybe you could try something like:
with open(filename, 'ab') as fileObject:
# code that computes numpy array
np.savetxt(fileObject, rowOfData, delimiter = ',', newline = ' ')
fileObject.write('\n')
# no need to .close() because the "with open()" will handle it
However
I have not used np.savetxt()
I am not an expert on your project
I do not even know if it is truly a file writing error to begin with
I just prefer the with open() technique because that's how all the introductory python books I've read structure their file reading/writing processes, so I assume there is wisdom in it. You could also consider doing like fabianegli commented and save to separate files (thats what my work does).

Parsing two files with Python

I'm still new to python and cannot achieve to make what i'm looking for. I'm using Python 3.7.0
I have one file, called log.csv, containing a log of CANbus messages.
I want to check what is the content of column label Data2 and Data3 when the ID is 348 in column label ID.
If they are both different from "00", I want to make a new string called fault_code with the "Data3+Data2".
Then I want to check on another CSV file where this code string appear, and print the column 6 of this row (label description). But this last part I want to do it only one time per fault_code.
Here is my code:
import csv
CAN_ID = "348"
with open('0.csv') as log:
reader = csv.reader(log,delimiter=',')
for log_row in reader:
if log_row[1] == CAN_ID:
if (log_row[5]+log_row[4]) != "0000":
fault_code = log_row[5]+log_row[4]
with open('Fault_codes.csv') as fault:
readerFC = csv.reader(fault,delimiter=';')
for fault_row in readerFC:
if "0x"+fault_code in readerFC:
print("{fault_row[6]}")
Here is a part of the log.csv file
Timestamp,ID,Data0,Data1,Data2,Data3,Data4,Data5,Data6,Data7,
396774,313,0F,00,28,0A,00,00,C2,FF
396774,314,00,00,06,02,10,00,D8,00
396775,**348**,2C,00,**00,00**,FF,7F,E6,02
and this is a part of faultcode.csv
Level;LED Flashes;UID;FID;Type;Display;Message;Description;RecommendedAction
1;2;1;**0x4481**;Warning;F12001;Handbrake Fault;Handbrake is active;Release handbrake
1;5;1;**0x4541**;Warning;F15001;Fan Fault;blablabla;blablalba
1;5;2;**0x4542**;Warning;F15002;blablabla
Also do you think of a better way to do this task? I've read that Pandas can be very good for large files. As log.csv can have 100'000+ row, it's maybe a better idea to use it. What do you think?
Thank you for your help!

Be careful with your indentation, you get this error because you sometimes you use spaces and other tabs to indent.
As PM 2Ring said, reading 'Fault_codes.csv' everytime you read 1 line of your log is really not efficient.
You should read faultcode once and store the content in RAM (if it fits). You can use pandas to do it, and store the content into a DataFrame. I would do that before reading your logs.
You do not need to store all log.csv lines in RAM. So I'd keep reading it line by line with csv module, do my stuff, write to a new file, and read the next line. No need to use pandas here as it will fill your RAM for nothing.

Saving and loading simple data in Python convenient way

I'm currently working on a simple Python 3.4.3 and Tkinter game.
I struggle with saving/reading data now, because I'm a beginner at coding.
What I do now is use .txt files to store my data, but I find this extremely counter-intuitive, as saving/reading more than one line of data requires of me to have additional code to catch any newlines.
Skipping a line would be terrible too.
I've googled it, but I either find .txt save/file options or way too complex ones for saving large-scale data.
I only need to save some strings right now and be able to access them (if possible) by key like in a dictionary key:value .
Do you know of any file format/method to help me accomplish that?
Also: If possible, should work on Win/iOS/Linux.

It sounds like using json would be best for this, which comes as part of the Python Standard library in Python-2.6+
import json
data = {'username':'John', 'health':98, 'weapon':'warhammer'}
# serialize the data to user-data.txt
with open('user-data.txt', 'w') as fobj:
json.dump(data, fobj)
# read the data back in
with open('user-data.txt', 'r') as fobj:
data = json.load(fobj)
print(data)
# outputs:
# {u'username': u'John', u'weapon': u'warhammer', u'health': 98}
A popular alternative is yaml, which is actually a superset of json and produces slightly more human readable results.

You might want to try Redis.
http://redis.io/
I'm not totally sure it'll meet all your needs, but it would probably be better than a flat file.

Building on "How to read and write a table / matrix to file with python?"

Back in Feb 8 '13 at 20:20, YamSMit asked a question (see: How to read and write a table / matrix to file with python?) similar to what I am struggling with: starting out with an Excel table (CSV) that has 3 columns and a varying number of rows. The contents of the columns are string, floating point, and string. The first string will vary in length, while the other string can be fixed (eg, 2 characters). The table needs to go into a 2 dimensional array, so that I can do manipulations on the data to produce a final file (which will be a text file). I have experimented with a variety of strategies presented in stackoverflow, but I am always missing something, and I haven't seen an example with all the parts, which is the reason for the struggle to figure this out.
Sample data will be similar to:
Ray Smith, 41645.87778, V1
I have read and explored numpy and astropy since the available documentation says they make this type of code easy. I have tried import csv. Somehow, the code doesn't come together. I should add that I am writing in Python 3.2.3 (which seems to be a mistake since a lot of documentation is for Python 2.x).
I realize the basic nature of this question directs me to read more tutorials. I have been reading many, yet the tutorials always refer to enough that is different, that I fail to assemble the right pieces: read the table file, write into a 2D array, then... do more stuff.
I am grateful to anyone who might provide me with a workable outline of the code, or pointing me to specific documentation I should read to handle the specific nature of the code I am trying to write.
Many thanks in advance. (Sorry for the wordiness - just trying to be complete.)

I am more familiar with 2.x, but from the 3.3 csv documentation found here, it seems to be mostly the same as 2.x. The following function will read a csv file, and return a 2D array of the rows found in the file.
import csv
def read_csv(file_name):
array_2D = []
with open(file_name, 'rb') as csvfile:
read = csv.reader(csvfile, delimiter=';') #Assuming your csv file has been set up with the ';' delimiter - there are other options, for which you should see the first link.
for row in read:
array_2D.append(row)
return array_2D
You would then be able to manipulate the data as follows (assuming your csv file is called 'foo.csv' and the desired text file is 'foo.txt'):
data = read_csv('foo.csv')
with open('foo.txt') as textwrite:
for row in data:
string = '{0} has {1} apples in his Ford {2}.\n'.format(row[0], row[1], row[2])
textwrite.write(string)
#if you know the second column is a float:
manipulate = float(row[1])*3
textwrite.write(manipulate)
string would then be written to 'foo.txt' as:
Ray Smith has 41645.87778 apples in his Ford V1.\n
and maniuplate would be written to 'foo.txt' as:
124937.63334

Python 3: Extracting Data from a .txt File?

So, I have this file that has data set up like this:
Bob 5 60
Carl 7 80
Rick 8 100
Santiago 7 30
I need to separate each part into three different lists. One for the name, one for the first number, and one for the second number.
But I don't really understand, how exactly do I extract those parts? Also, let's say I want to make a tuple with the first line, with each of the different parts (the name, first number, and second number) into a single tuple?
I just don't get how I extract that information.
I just learned how to read and write text files...so I'm pretty clueless.
EDIT: As a note, the text file already exists. The program I'm working on needs to read the text file, which has its data formatted in the way I listed.

You can split each line on whitespace:
with open(yourfile) as f:
rows = [l.split() for l in f]
names, firstnums, secondnums = zip(*rows)
zip(*iterable) re-arranges the 3 columns into 3 lists.

Would not the module Pickle be ideal here? Pickle gives Python functionality to load and save things that need to be 'useable' in Python, so instead of just importing a string from a text file and having to parse it, pickle can load it and give you the actual container you're trying to work with.
example:
import pickle
myList = ["Bob", 1, 2]
listToBeSaved = pickle.dumps(myList) # write this data to your save file
#insert code where you work with the file and save it
#.........
#upon needing to open and work with this file
listToBeLoaded = open(fileYouWroteTo)
listTranslated = pickle.loads(listToBeLoaded) # turns the loaded data back into a proper list

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.