I have a data file and I want to delete the first 3 characters of each word on every line.
Here is an example of my file:
input
"13X5106,18C2295,17C1462,17X4893,14X4215,16C3729,14C1026,END"
"17C2308,14C1030,15C904,20C1602,17C1017,18C1030,END"
"13C2369,20C1505,18X4245,15C1224,14C1031,12C885,17C936,END"
"11C3080,13C4123,16C1180,14C1141,15C932,18C1467,END"
output
"5106,2295,1462,4893,4215,3729,1026,END"
"2308,1030,904,1602,1017,1030,END"
"2369,1505,4245,1224,1031,885,936,END"
"3080,4123,1180,1141,932,1467,END"
I tried to write the code, but the output does not look the way I want.
file1 = open('D:\pythonProject\block1.txt','r')
data = file1.read()
remove_char = [sub[3:] for sub in data]
print(remove_char)
If you use file1.readlines(), you need to split each line on commas. The only catch is the last item: every line ends with END followed by a newline, so slicing off three characters leaves only the end-of-line character there. That is easy to get rid of, as shown below:
Code:
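# Raw string, so that "\b" in the path is not read as a backspace escape
file1 = open(r'D:\pythonProject\block1.txt', 'r')
remove_char = [[s[3:] for s in sub.split(',')] for sub in file1.readlines()]
for the_list in remove_char:
    # the last item is what is left of "END\n", so skip it
    print(the_list[0:-1])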
Output:
['5106', '2295', '1462', '4893', '4215', '3729', '1026']
['2308', '1030', '904', '1602', '1017', '1030']
['2369', '1505', '4245', '1224', '1031', '885', '936']
['3080', '4123', '1180', '1141', '932', '1467']
I read the file with f.readlines() and removed the " from each line.
Then each line is split on , and every word is processed as word[3:].
with open("...", "r") as f:
    lines = f.readlines()
lines = map(lambda x: x.replace('"',"").strip("\n").split(","), lines)
res = []
for line in lines:
    new_line = []
    for word in line:
        if word != "END":
            word = word[3:]
        new_line.append(word)
    res.append(",".join(new_line))
res = "\n".join(res)
print(res)
# Output
"""
5106,2295,1462,4893,4215,3729,1026,END
2308,1030,904,1602,1017,1030,END
2369,1505,4245,1224,1031,885,936,END
3080,4123,1180,1141,932,1467,END
"""
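You can try this to print each line in a for loop:
# Raw string, so that "\b" in the path is not read as a backspace escape
file1 = open(r'D:\pythonProject\block1.txt')
data = file1.readlines()
for sub in data:
    # eval(sub) turns the quoted line from the file into a plain string
    line = [j[3:] for i in [eval(sub)] for j in i.split(',')[:-1]]+[eval(sub)[-3:]]
    remove_char = f'"{chr(44).join(line)}"'
    print(remove_char)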
Or as a generator expression:
remove_char = '\n'.join('"'+chr(44).join(j[3:] for i in [eval(s)]
                        for j in i.split(chr(44))[:-1])+','+chr(44).join([eval(s)[-3:]])+'"'
                        for s in open(r'D:\pythonProject\block1.txt').readlines())
print(remove_char)
Output:
"5106,2295,1462,4893,4215,3729,1026,END"
"2308,1030,904,1602,1017,1030,END"
"2369,1505,4245,1224,1031,885,936,END"
"3080,4123,1180,1141,932,1467,END"
Here is a quick solution using a list comprehension:
data = ["13X5106", "18C2295"] # this is a sample list of strings
print([code[3:] for code in data if code != "END"])
This prints the list with the first three characters removed from every string, skipping the "END" string:
['5106', '2295']
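The pieces from the answers above (strip the quotes, split on commas, slice each word, leave END alone) can be combined into one small helper. This is only a sketch: drop_prefix and its parameters n and keep are names made up for illustration, and it assumes every line looks like the sample in the question.

```python
def drop_prefix(line, n=3, keep=("END",)):
    """Remove the first n characters of every comma-separated word in a line.

    drop_prefix, n and keep are hypothetical names chosen for this sketch.
    """
    text = line.strip()
    # Remember whether the line was wrapped in double quotes so they can be restored.
    quoted = len(text) >= 2 and text.startswith('"') and text.endswith('"')
    if quoted:
        text = text[1:-1]
    # Words listed in keep (here only END) are passed through unchanged.
    words = [w if w in keep else w[n:] for w in text.split(",")]
    result = ",".join(words)
    return '"{}"'.format(result) if quoted else result


sample = '"13X5106,18C2295,17C1462,END"\n'
print(drop_prefix(sample))  # "5106,2295,1462,END"
```

To process a whole file, the same function could be called on each line inside a with open(...) block.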
Hello, I have lines like the ones below in a file.
I want to convert everything from Text:0 to 8978 into a single string, and likewise for the other part, i.e. from Text:1 to 8978.
Text:0
6786993cc89 70hgsksgoop 869368
7897909086h fhsi799hjdkdh 099h
Gsjdh768hhsj dg9978hhjh98 8978
Text:1
8786993cc89 70hgsksgoop 869368
7897909086h fhsi799hjdkdh 099h
Gsjdh768hhsj dg9978hhjh98 8978
I am getting this output:
6
7
G
8
7
G
But I want the first character of string one and of string two, i.e.:
6
8
The code is:
file = open ('tem.txt','r')
lines = file.readlines()
print(lines)
for line in lines:
    line=line.strip()
    linex=line.replace(' ','')
    print(linex)
    print (linex[0])
I'm not sure exactly what you need, so:
#1. If you only need to print the first number (6), I think your code is right.
#2. If you need to print the first part of the string (before the space), this may help:
line="6786993cc8970hgsksgoop869368 7897909086hfhsi799hjdkdh099h Gsjdh768hhsjdg9978hhjh988978"
print(line[0])
print(line.split(' ')[0])
EDIT
To read a file....
file = open('file.txt', 'r')
Lines = file.readlines()
file.close()
for line in Lines:
    print(line.split(' ')[0])
New EDIT
First you need to merge the lines of each block into one string; after that you can take the first character. Please try this:
file = open ('tem.txt','r')
lines = file.readlines()
file.close()
linesArray = []
lineTemp = ""
for line in lines:
    if 'Text' in line:
        if lineTemp:
            linesArray.append(lineTemp)
            lineTemp = ""
    else:
        lineTemp += line.strip()
linesArray.append(lineTemp)
for newline in linesArray:
    print(newline.split(' ')[0][0])
This should work only if you want to view the first character. Essentially, this code will read your text file, convert multiple lines in the text file to one single string and print out the required first character.
with open(r'tem.txt', 'r') as f:
    data = f.readlines()
line = ''.join(data)
print(line[0])
EDITED RESPONSE
Try using regex. Hope this helps.
import re
pattern = re.compile(r'(Text:[0-9]+\s)+')
with open(r'tem.txt', 'r') as f:
    data = f.readlines()
data = [i for i in data if len(i.strip())>0]
line = ' '.join([i.strip() for i in data if len(i)>0]).strip()
occurences = re.findall(pattern, line)
for i in occurences:
    match_i = re.search(i, line)
    start = match_i.end()
    print(line[start])
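If a regex feels heavy for this, the grouping can also be done with a plain loop. The sketch below works on a list of lines instead of the real tem.txt, and group_blocks and marker are made-up names; it assumes every block starts with a line beginning with Text: and, as in the question's code, removes the spaces while joining.

```python
def group_blocks(lines, marker="Text:"):
    """Join the lines that follow each marker line into one string per block.

    group_blocks and marker are hypothetical names used only in this sketch.
    """
    blocks = []
    current = None  # None means no marker line has been seen yet
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(marker):
            if current is not None:
                blocks.append(current)
            current = ""
        elif current is not None:
            # Drop the spaces, as the question's code does with replace(' ', '')
            current += line.replace(" ", "")
    if current is not None:
        blocks.append(current)
    return blocks


sample = [
    "Text:0\n",
    "6786993cc89 70hgsksgoop 869368\n",
    "7897909086h fhsi799hjdkdh 099h\n",
    "Gsjdh768hhsj dg9978hhjh98 8978\n",
    "Text:1\n",
    "8786993cc89 70hgsksgoop 869368\n",
    "7897909086h fhsi799hjdkdh 099h\n",
    "Gsjdh768hhsj dg9978hhjh98 8978\n",
]
for block in group_blocks(sample):
    print(block[:1])  # first character of each joined block
```

With the sample from the question this prints 6 and 8, one per block.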
I want to replace a line in a file, but my code doesn't do what I want: it doesn't change that line. The problem seems to be the space between ALS and 4277 in input.txt. I need to keep that space in the file. How can I fix my code?
Part of input.txt:
ALS 4277
Related part of the code:
for lines in fileinput.input('input.txt', inplace=True):
    print(lines.rstrip().replace("ALS"+str(4277), "KLM" + str(4945)))
Desired output:
KLM 4945
Using the same idea that other users have already pointed out, you can also reproduce the original spacing by first matching it and saving it in a variable (spacing in my code):
import re
with open('input.txt') as f:
    lines = f.read()
# re.search finds the pattern anywhere in the text; re.match would only look at the very start
match = re.search(r'ALS(\s+)4277', lines)
if match is not None:
    spacing = match.group(1)
    lines = re.sub(r'ALS\s+4277', 'KLM%s4945' % spacing, lines.rstrip())
print(lines)
As the spaces vary you will need to use regex to account for the spaces.
import re
lines = "ALS 4277 "
line = re.sub(r"(ALS\s+4277)", "KLM 4945", lines.rstrip())
print(line)
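The snippet above writes a single space into the replacement. If the original amount of whitespace has to survive, one option is to capture it and refer back to it in the replacement string. A minimal sketch, where replace_keep_spacing is just an illustrative name:

```python
import re


def replace_keep_spacing(line):
    """Replace ALS...4277 with KLM...4945, keeping the whitespace in between.

    replace_keep_spacing is a hypothetical helper name for this sketch.
    """
    # (\s+) captures the whitespace; \g<1> puts it back. The \g<1> form is
    # used because a plain \1 followed by 4945 would be read as group 14.
    return re.sub(r"ALS(\s+)4277", r"KLM\g<1>4945", line)


print(replace_keep_spacing("ALS    4277"))  # KLM    4945
```

Lines that do not contain the pattern are returned unchanged, so the function can be applied to every line of the file.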
Try:
with open('input.txt') as f:
    for line in f:
        # split() copes with any amount of whitespace; other lines pass through untouched
        if line.split() == ['ALS', '4277']:
            line = line.replace('ALS', 'KLM').replace('4277', '4945')
        print(line, end='') # as line has '\n'
I am working on a Python program where the goal is to create a tool that takes the first word of each line in one file and puts it beside the corresponding line of a different file.
This is the code snippet:
lines = open("x.txt", "r").readlines()
lines2 = open("c.txt", "r").readlines()
for line in lines:
    r = line.split()
    line1 = str(r[0])
    for line2 in lines2:
        l2 = line2
        rn = open("b.txt", "r").read()
        os = open("b.txt", "w").write(rn + line1+ "\t" + l2)
but it doesn't work correctly.
My question: how can I make this tool take the first word of each line in one file and put it beside the corresponding line from another file, for all lines in the file?
For example:
File: 1.txt :
hello there
hi there
File: 2.txt :
michal smith
takawa sama
I want the result to be :
Output:
hello michal smith
hi takawa sama
By using the zip function, you can loop through both simultaneously. Then you can pull the first word from your greeting and add it to the name to write to the file.
greetings = open("x.txt", "r").readlines()
names = open("c.txt", "r").readlines()
with open("b.txt", "w") as output_file:
    for greeting, name in zip(greetings, names):
        greeting = greeting.split(" ")[0]
        # name still ends with its own newline, so strip it before adding one
        output = "{0} {1}\n".format(greeting, name.rstrip("\n"))
        output_file.write(output)
Yes, as Tigerhawk indicated, you want to use the zip function, which combines the elements at the same index of different iterables into tuples (the i-th tuple holds the i-th element of each iterable).
Example code -
lines = open("x.txt", "r").readlines()
lines2 = open("c.txt", "r").readlines()
newlines = ["{} {}".format(x.split()[0] , y) for x, y in zip(lines,lines2)]
with open("b.txt", "w") as opfile:
    # newlines is a list of strings, so use writelines() rather than write()
    opfile.writelines(newlines)
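To see what zip actually hands back, here is a small sketch that uses in-memory lists in place of the two files. The lists mirror the example from the question; the names greetings, names, pairs and combined are only illustrative.

```python
# Stand-ins for the lines read from the two files (note the trailing newlines)
greetings = ["hello there\n", "hi there\n"]
names = ["michal smith\n", "takawa sama\n"]

# zip pairs up the items at the same index; list() makes the pairs visible
pairs = list(zip(greetings, names))
print(pairs[0])  # ('hello there\n', 'michal smith\n')

# First word of the greeting, then the full name line (which keeps its newline)
combined = ["{} {}".format(g.split()[0], n) for g, n in pairs]
print("".join(combined), end="")
```

If the two inputs have different lengths, zip stops at the shorter one, so surplus lines in the longer file are silently ignored.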
with open('x.txt', 'r') as lines:
    with open('c.txt', 'r') as lines2:
        with open('b.txt', 'w') as result:
            # In Python 3 the built-in map and zip are lazy, like itertools.imap/izip in Python 2
            words = map(lambda x: str(x.split()[0]), lines)
            results = zip(words, lines2)
            for word, line in results:
                result_line = '{0} {1}'.format(word, line)
                result.write(result_line)
This code works without loading the whole files into memory.
The values in my current .txt file:
1256
5679
 t67
y1
 890
The end result I want is these values wrapped in '' on one line, separated by commas, without removing the spaces:
'1256','5679',' t67','y1',' 890'
What I tried:
output = r""
file_name = r"path"
string_to_add = "','"
with open(file_name, 'r') as f:
    file_lines = [''.join([x.strip(), string_to_add, '\n']) for x in f.readlines()]
with open(file_name, 'w') as f:
    f.writelines(file_lines)
but it didn't work: it removed the spaces and didn't put the values next to each other.
The reason you are getting a wrong result is x.strip(): it removes all leading and trailing whitespace, including the spaces you want to keep.
Couldn't you just do:
import sys
#each line is now an entry in a list
data = open(sys.argv[1], 'r').readlines()
#remove only the newlines and quote each value before writing to output
line = ','.join("'" + x.rstrip('\n') + "'" for x in data)
OUT = open('output.txt','w')
OUT.write(line)
OUT.close()
x.strip() removes spaces, newlines and tabs from both sides of x. Use x.rstrip() instead to remove only trailing whitespace and newlines.
You can also join directly on your string_to_add instead of what you currently do.
with open(file_name, 'r') as f:
    file_lines = string_to_add.join(x.rstrip() for x in f.readlines())
with open(file_name, 'w') as f:
    # file objects have write(), not writeline(); add the outer quotes that join() does not supply
    f.write("'" + file_lines + "'")
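The difference between the strip variants is easy to check on a single value. The line below is a made-up example with a leading space, two trailing spaces and a newline:

```python
line = " t67  \n"  # hypothetical line: leading space, trailing spaces, newline

both_sides = line.strip()         # whitespace removed on both ends
right_only = line.rstrip()        # only trailing whitespace removed
newline_only = line.rstrip("\n")  # only the trailing newline removed

print(repr(both_sides))    # 't67'
print(repr(right_only))    # ' t67'
print(repr(newline_only))  # ' t67  '
```

So rstrip() keeps the leading space the question asks for, and rstrip("\n") is the choice if trailing spaces should be preserved as well.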
The method str.strip() works this way: https://www.programiz.com/python-programming/methods/string/strip
It removes the characters given in the parentheses from both ends of the string; when called without an argument, it removes leading and trailing whitespace, which is why your spaces disappear.
The method str.join() works this way: https://www.w3schools.com/python/ref_string_join.asp
It takes the string and uses it as the separator to join an iterable of strings (list, tuple, ...).
So what you are doing here is simply appending ',' to each line of the file.
Try using this:
myString = ""
with open(file_name, 'r') as f:
    for line in f.readlines():
        # drop only the newline so that leading spaces are kept
        myString += "'" + line.rstrip('\n') + "',"
myString = myString[:-1]