Need to merge information from four files together [closed] - python

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have 2 big files, both of which include segments of information I want. I have extracted that information into output files, so I now have 4 files which hold the information I need.
What I want to do is merge the information from the four files into 1 file in a neat, line-by-line format with 4 columns separated by commas, and I want to put something at the top of the file so that whoever opens it knows what information is in each column. Is this possible in Python?
Here is the info I want to merge:
'/usr/share/doc/HTML/es/kioslave/index.docbook'
'Redhat 7.3'
'Linux'
'D84270022E57F1850C8464FA432ADFF99588157B'
Every line above is one line from one of my files; the files go on for many lines so I cannot post the whole thing, but that is an example of the information.

The Python zip function combines the corresponding items from multiple sources into tuples, so it can pull one line from each of the four files at a time:
for row in zip(file1, file2, file3, file4):
    # output the 4 column values in row

It is entirely possible; look at the csv module, which is most likely what you need. It's easy to use.
You'll be creating a comma-separated values file (.csv) where the first row holds headers indicating the contents of each column, i.e.
path,distro,os,serial
/usr/share/doc/HTML/es/kioslave/index.docbook,Redhat 7.3,Linux,D84270022E57F1850C8464FA432ADFF99588157B
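Putting the two answers together, here is a minimal sketch (Python 3) that zips the four files line by line and writes them out with the csv module. The input filenames and column headers below are placeholders you would replace with your own.
import csv

# Minimal sketch: each of the four extracted files is assumed to hold one
# value per line; the filenames below are placeholders.
filenames = ["paths.txt", "distros.txt", "oses.txt", "hashes.txt"]
files = [open(name) for name in filenames]
try:
    with open("merged.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "distro", "os", "serial"])  # header row
        # zip() yields one line from each file per iteration
        for row in zip(*files):
            writer.writerow(value.strip() for value in row)
finally:
    for f in files:
        f.close()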

Related

Splitting the paragraphs in two Python strings into lines of a maximum width [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 months ago.
I have two strings in a Python script which each contain single lines of text, blank lines and multiple paragraphs. Some of the paragraphs in the strings are very long so I would like to split them into multiple lines of text so that each line in the paragraphs is a certain maximum width. I would then like to split each string into lines so that the strings may be compared using the HtmlDiff class in the difflib module. Might someone know a quick and easy way to do this? I would greatly appreciate it. Thanks so much.
By searching, I found the following link:
How to modify list entries during for loop?
Using the information in the first answer, and the first comment to this question, I was able to achieve what I was looking for with code like the following:
import textwrap

firstListOfLines = firstText.splitlines()
for index, line in enumerate(firstListOfLines):
    # wrap each long line at textwrap's default width of 70 characters
    firstListOfLines[index] = textwrap.fill(line)
firstListOfLines = '\n'.join(firstListOfLines).splitlines()

secondListOfLines = secondText.splitlines()
for index, line in enumerate(secondListOfLines):
    secondListOfLines[index] = textwrap.fill(line)
secondListOfLines = '\n'.join(secondListOfLines).splitlines()
Thanks so much. The first comment helped me to think about what to do. Thanks again.
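For completeness, a minimal sketch of the comparison step the question mentions, assuming the two wrapped line lists from above; the output filename is a placeholder.
import difflib

# build an HTML side-by-side diff of the two wrapped line lists
diff = difflib.HtmlDiff().make_file(firstListOfLines, secondListOfLines)
with open("diff_output.html", "w") as f:
    f.write(diff)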

A folder contains files of dog breeds like `Boston_terrier_02303.jpg`. I want to remove the numeric parts and also the `_`. How do I achieve this? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
My folder contains .jpg files. I need to fetch only the alphabetic characters from the file names.
I removed all the non-alphabetic characters, but that resulted in a single string without spaces.
Input: Boston_terrier_02303.jpg
Desired Output: Boston terrier
Assuming that you always have the same structure (n word fragments, 1 number, and the output), you can simply get your desired result by:
new_string = " ".join(string.split("_")[:-1])
To elaborate:
You start by splitting the string at the underscores, then select everything but the last piece (the number plus extension). Then simply join the remaining pieces with a space between them.
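As a sketch of applying that one-liner to every file in the folder: the folder path below is a placeholder, and nothing is renamed, only printed.
import os

folder = "dog_breeds"  # placeholder path to your folder of .jpg files
for filename in os.listdir(folder):
    if filename.endswith(".jpg"):
        name, _ = os.path.splitext(filename)    # "Boston_terrier_02303"
        breed = " ".join(name.split("_")[:-1])  # "Boston terrier"
        print(breed)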

Need help importing specific data (variable values) from a text file and ignoring non useful text and metadata [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Here is my attempted code, though it may be rubbish. I just want the two columns of data from line 15 of the file onwards.
My code reads:
import numpy as np
import matplotlib as mplt

data = np.genfromtxt('practice_data.txt',
                     dtype='float',
                     delimiter=' ')
time = data[:, 0]
channel = data[:, 1]
If anyone can help me extract the two columns as two variables, that would be amazing.
genfromtxt has a parameter named skip_header.
You can also extract the two columns into two named fields like this:
data = np.genfromtxt('practice_data.txt',
                     dtype=[('first column name', 'f8'), ('second column name', 'i8')],
                     delimiter=' ',
                     skip_header=14)
skip_header=14 skips the first fourteen lines, so the header and metadata are ignored, and you are left with two named columns that you can access afterwards (e.g. data['first column name']).
I didn't try the script, but it should work. Good luck!
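Alternatively, here is a minimal sketch that keeps the plain-float approach from the question, skips the 14 header lines, and unpacks the two columns straight into separate arrays; the filename is the same placeholder as above.
import numpy as np

# skip_header drops the metadata lines; unpack=True transposes the result
# so each column comes back as its own array
time, channel = np.genfromtxt('practice_data.txt',
                              delimiter=' ',
                              skip_header=14,
                              unpack=True)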

How do I read a text file into a string variable in Python starting at the second line? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I use the following code segment to read a file in Python:
file = open("test.txt", "rb")
data = file.readlines()[1:]
file.close()
print data
However, I need to read the entire file (apart from the first line) as a string into the variable data.
As it is, when my file's contents are test test test, my variable contains the list ['testtesttest'].
How do I read the file into a string?
I am using python 2.7 on Windows 7.
The solution is pretty simple. You just need to use a with ... as construct like the one below, read from line 2 onward, and then join the returned list into a string. In this particular instance, I'm using "" as a join delimiter, but you can use whatever you like.
with open("/path/to/myfile.txt", "rb") as myfile:
    # skip the first line, then join the remaining lines into one string
    data_to_read = "".join(myfile.readlines()[1:])
    ...
The advantage of using a with ... as construct is that the file is closed automatically when the block ends, so you don't need to call myfile.close().
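A minimal alternative sketch that avoids building the list of lines: consume the first line with next() and read the rest in one call (the path is the same placeholder).
with open("/path/to/myfile.txt", "rb") as myfile:
    next(myfile)                  # read and discard the first line
    data_to_read = myfile.read()  # everything after the first line, as one string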

Parsing a series of fixed-width files [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have a series (~30) of files that are made up of rows like:
xxxxnnxxxxxxxnnnnnnnxxxnn
Where x is a char and n is a number, and each group is a different field.
This is fixed for each file, so it would be pretty easy to split and read with a struct or slice; however, I was wondering if there's an effective way of doing it for a lot of files (with each file having different fields and lengths) without hard-coding it.
One idea I had was creating an XML file with the schema for each file, and then I could dynamically add new ones where required and the code would be more portable, however I wanted to check there are no simpler/more standard ways of doing this.
I will be outputting the data into either Redis or an ORM if this helps, and each file will only be processed once (although other files with different structures will be added at later dates).
Thanks
You could use itertools.groupby, with str.isdigit for instance (or isalpha):
>>> import itertools
>>> line = "aaa111bbb22cccc345defgh67"
>>> [''.join(i[1]) for i in itertools.groupby(line, str.isdigit)]
['aaa', '111', 'bbb', '22', 'cccc', '345', 'defgh', '67']
I think #fredtantini's answer contains a good suggestion, and here's a fleshed-out way of applying it to your problem, coupled with a minor variation of the code in my answer to a related question titled Efficient way of parsing fixed width files in Python:
from itertools import groupby
from struct import Struct

isdigit = str.isdigit

def parse_fields(filename):
    with open(filename) as file:
        # determine the layout of fields from the first line of the file
        firstline = file.readline().rstrip()
        fieldwidths = (len(''.join(i[1])) for i in groupby(firstline, isdigit))
        fmtstring = ''.join('{}s'.format(fw) for fw in fieldwidths)
        parse = Struct(fmtstring).unpack_from
        file.seek(0)  # rewind
        for line in file:
            yield parse(line)

for row in parse_fields('somefile.txt'):
    print(row)
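A minimal variant sketch of the same idea that uses plain string slicing instead of struct, so everything stays in text mode; 'somefile.txt' is again a placeholder.
from itertools import groupby

def parse_fields_by_slicing(filename):
    with open(filename) as f:
        first = f.readline().rstrip()
        # field widths inferred from the first line's digit/non-digit runs
        widths = [len(''.join(g)) for _, g in groupby(first, str.isdigit)]
        # turn the widths into (start, stop) slice boundaries
        bounds, start = [], 0
        for w in widths:
            bounds.append((start, start + w))
            start += w
        f.seek(0)  # rewind and cut every line at the same boundaries
        for line in f:
            yield [line[a:b] for a, b in bounds]

for row in parse_fields_by_slicing('somefile.txt'):
    print(row)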
