I'm trying to send data from the client's side to the server's side using TCP socket programming.
What I did is read in the file names in the client's directory and then send each file name to the server by calling clientSocket.send("FILE " + fileToTransfer + "\n"). Then on the server's side, I use regex to extract the file name.
However, the client always sends the "FILE fileName.txt" and the file's contents together, so I suppose on the server's side I will have to use regex to separate the file name from the file's contents.
So what I did on the server's side is use getFileName = re.match(r'FILE (.*)(\n)(.*)', data) to get the file name and its contents separately. Unfortunately, (.*) does not match line breaks.
In that case, how do I separate the file contents from the file name? Is there a way to get the client to send the file name first, then wait for the server to receive the file name before the file contents are sent over? Or is there a regex I can use to separate the file name and the file contents?
You can send the file size along with the filename. This lets the server know how many bytes it should read, and in that case you don't need to read the entire file content into memory: you can read it chunk by chunk until the file size is exhausted (zeroed out) and write the chunks to disk. Something like this:
## client side
# get the file size first, for example:
# filesize = os.path.getsize(filepath)
# fd = open(filepath, "rb")
sock.sendall("FILE %s %d\n" % (filename, filesize))
sock.sendall(fd.read())
...
## server side
# error handling is left out
header = ""
while True:
    d = sock.recv(1)
    if d == '\n':
        break
    header += d
filesize = int(header.split()[-1])
# or search for the last space in header
# and get a substring of header as filename
filename = "".join(header.split()[1:-1])
data = ""
while filesize > 0:
    chunk = sock.recv(1024)  # or any amount of data
    if not chunk:
        break  # connection closed early
    filesize -= len(chunk)
    data += chunk
# to avoid holding the whole file in memory, write each chunk
# to an open file here instead of appending it to data
Or you can give up on the regex and just find the first \n:
## client side
sock.sendall("FILE %s\n" % filename)
sock.sendall(fd.read())
...
## server side
# data = read data here
newline = data.find('\n')
assert newline != -1  # some error handling here
header = data[:newline]
filename = header[len("FILE "):]
content = data[newline+1:]
If you just want to fix the regex, change the line to:
getFileName = re.match(r'FILE (.*?)(\n)(.*)', data, re.DOTALL)
The DOTALL flag makes the . match newlines as well. The extra ? I added makes the * quantifier non-greedy, i.e., it'll stop at the first newline it sees (I assume newlines cannot be part of the file name).
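For example, a quick sanity check of the groups on a made-up payload:
import re
m = re.match(r'FILE (.*?)(\n)(.*)', 'FILE notes.txt\nline one\nline two', re.DOTALL)
print m.group(1)  # 'notes.txt'
print m.group(3)  # 'line one\nline two'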
What you should probably do is send the file name as part of a header or something.
Related
I want to edit a line in a text file on a Linux server using Python. The process involves the following steps:
Telnet to the server (using telnetlib)
Go to the required directory
open the text file in the directory
set or unset the flag (YES or NO) of the variable in the text file based on the requirement
save the file and exit
I'm able to automate up to step 2. However, I'm stuck at steps 3 through 5.
I tried to mimic the steps I follow manually (using the vim editor), but I'm not able to perform the 'ESC', replace and ':wq!' steps. Is there an alternative procedure to edit the file, or any way to improve on mimicking the manual process?
I have added my code here
host = input("Enter the IP address:")
port = int(input("Enter the port:"))
tn = telnetlib.Telnet(host, port)
tn.write(b'\n')
tn.read_until(b"login:")
tn.write(b"admin" + b'\n')
tn.read_until(b"Password:")
tn.write(b"admin" + b'\n')
tn.write(b"version" + b'\n')
tn.write(b"path/to/file/" + b'\n')
# OPEN THE FILE and SET or RESET THE FLAG and CLOSE
with in_place.InPlace('filename.txt') as file:
    for line in file:
        line = line.replace('line_to_change', 'changed_data')
        file.write(line)
print('Task executed')
I tried using the in_place library to set the flag, but the program looks for the file on my local machine rather than on the server, so it throws an error message indicating that the file is not present.
If you are able to connect to your remote server, the rest should work as follows:
with open('path/to/file','r') as fr:
    data = fr.readlines()  # returns list of lines
changed_data = ["changed_data\n" if line == "line_to_change\n" else line
                for line in data]
with open('path/to/file','w') as fw:
    for line in changed_data:
        fw.write(line)  # write the lines back to the file
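Note that open() runs on the machine executing the script, so the above works only if the script runs on the server itself. If the edit has to happen over the telnet session instead, one option is to run a stream editor remotely; a minimal sketch, assuming the remote shell provides sed (the flag name and path here are hypothetical):
# hypothetical: flip the flag from YES to NO in the remote file over the telnet session
tn.write(b"sed -i 's/^FLAG=YES/FLAG=NO/' /path/to/file/filename.txt" + b'\n')
tn.write(b"exit" + b'\n')
print(tn.read_all().decode('ascii'))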
I am trying to send each file from a disk image to a remote server using paramiko.
class Server:
    def __init__(self):
        self.ssh = paramiko.SSHClient()
        self.ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self.ssh.connect('xxx', username='xxx', password='xxx')

    def send_file(self, i_node, name):
        sftp = self.ssh.open_sftp()
        serverpath = '/home/paul/Testing/'
        try:
            sftp.chdir(serverpath)
        except IOError:
            sftp.mkdir(serverpath)
            sftp.chdir(serverpath)
        serverpath = '/home/Testing/' + name
        sftp.putfo(fs.open_meta(inode = i_node), serverpath)
However when I run this I get an error saying that "pytsk.File has no attribute read".
Is there any other way of sending this file to the server?
After a quick investigation I think I found your problem. Paramiko's sftp.putfo expects a Python file object as its first parameter, but the file object of Pytsk3 is a completely different thing. Your sftp object tries to call "read" on it, and the Pytsk3 file object has no "read" method, hence the error.
You could in theory try extending the Pytsk3.File class and adding this method, but I would not hold my breath that it actually works.
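If you do want to try that route, here is a minimal sketch of what a hypothetical file-like adapter over read_random might look like (an untested assumption on my part, not part of pytsk3):
class TskFileAdapter(object):
    # hypothetical wrapper giving a pytsk3 file object a file-like read()
    def __init__(self, tsk_file):
        self.f = tsk_file
        self.size = tsk_file.info.meta.size
        self.offset = 0

    def read(self, n=-1):
        # clamp the request to the bytes remaining in the file
        if n < 0 or self.offset + n > self.size:
            n = self.size - self.offset
        if n <= 0:
            return b""
        data = self.f.read_random(self.offset, n)
        self.offset += n
        return data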
Either way, I would just read the file into a temporary one and send that. Something like this (you would need to make the temp file name handling more clever and delete the file afterwards, but you get the idea):
serverpath = '/home/Testing/' + name
tmp_path = "/tmp/xyzzy"
file_obj = fs.open_meta(inode = i_node)
# Add here tests to confirm this is actually a file, not a directory
tha = open(tmp_path, "wb")
tha.write(file_obj.read_random(0, file_obj.info.meta.size))
tha.close()
rha = open(tmp_path, "rb")
sftp.putfo(rha, serverpath)
rha.close()
# Delete temp file here
Hope this helps. Note that this reads the whole file from the fs image into memory before writing the temp file, so if the file is massive you would run out of memory.
To work around that, read the file in chunks, looping over it with read_random in suitably sized pieces (its parameters are the start offset and the amount of data to read). That lets you build the temp file in chunks of, for example, a couple of megabytes.
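A minimal sketch of that chunked copy, reusing fs, i_node and tmp_path from above:
CHUNK = 2 * 1024 * 1024  # read a couple of megabytes at a time
file_obj = fs.open_meta(inode = i_node)
size = file_obj.info.meta.size
with open(tmp_path, "wb") as out:
    offset = 0
    while offset < size:
        amount = min(CHUNK, size - offset)
        out.write(file_obj.read_random(offset, amount))
        offset += amount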
This is just a simple example to illustrate your problem.
Hannu
I'm making an extra function for my chat program which allows you to register by typing a specific command; the host script then saves your peername and a name you enter in the terminal. It is saved like 57883:Jack in a txt file (names.txt) on the host machine. If a number of people have registered, it'll look like this:
57883:jack
57884:bob
57885:connor
57886:james
57887:zzhshsb93838
57887:ryan
When someone sends a message I want to know if his/her name is registered and, if so, get the name to send to the client, so instead of seeing the peername the client will see the name of the person sending the message.
In order to do that I need to know if the peername is in the file and, if so, where in the file, i.e. in which line. I've got this so far:
peer = sock.getpeername()
with open('names.txt', 'r') as d:
    lines = d.readlines()
    for peer in lines:
And I don't know how to find out in which line it was found, and once I know that, how to separate 57883 and jack and select jack and save it. Cheers!
with open("names.txt", "r") as f:
for i, line in enumerate(f):
numb, name = line.rstrip().split(":")
print i, numb, name
You can ask Python to provide a tuple of line number and line content for each line of the file (see enumerate(f)). Then you remove the line terminator (see line.rstrip()) and split the line to parts separated by the provided character (see split(":")). You have all three components available then.
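To tie it back to the question, a small sketch of a lookup helper built on the same idea (the function name is made up; names.txt has the layout shown in the question):
def lookup_name(peer_port):
    # returns (line_number, registered_name) for e.g. 57883, or (None, None)
    with open("names.txt", "r") as f:
        for i, line in enumerate(f):
            numb, name = line.rstrip().split(":")
            if numb == str(peer_port):
                return i, name
    return None, None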
I have this utils.py file in my Django project:
def range_data(ip):
    r = []
    f = open(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ',
             'GeoIPCountryWhois.csv'))
    for num, row in enumerate(csv.reader(f)):
        if row[0] <= ip <= row[1]:
            r.append([row[4]])
            return r
        else:
            continue
    return r
Here the ip parameter is just an IPv4 address. I am using the open source MaxMind GeoIPCountryWhois.csv file.
Some starting content of GeoIPCountryWhois.csv:
"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"
I have also read about the issue, but didn't find anything easy to understand. Would you please help me solve this error?
In my utils method, I am checking the country name of the IP address passed in as a parameter.
I had a similar problem earlier today: there was an end quote missing from a line, and the solution is to instruct the reader to perform no special processing of quote characters (quoting=csv.QUOTE_NONE).
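A minimal sketch of that reader setup; note that QUOTE_NONE leaves the literal quote characters in each field, so you may have to strip them yourself:
import csv
f = open("GeoIPCountryWhois.csv", "rb")
for row in csv.reader(f, quoting=csv.QUOTE_NONE):
    cleaned = [field.strip('"') for field in row]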
You can preprocess the csv by removing the stray carriage returns, like below.
import csv
content = open("GeoIPCountryWhois.csv", "r").read().replace('\r\n', '\n')
with open("GeoIPCountryWhois2.csv", "w") as g:
    g.write(content)
Then use GeoIPCountryWhois2.csv for the csv reader.
A wild guess: using a lineterminator may solve your problem:
for num, row in enumerate(csv.reader(f, lineterminator='\n')):
See also: http://docs.python.org/lib/csv-fmt-params.html
You must open your files as binary:
def range_data(ip):
    r = []
    f = open(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ',
             'GeoIPCountryWhois.csv'), 'rb')
    for num, row in enumerate(csv.reader(f)):
        # Your things.
Note the 'rb' mode there; otherwise the file could be opened with native line endings, and the CSV reader doesn't handle the various forms very well. Certainly the copy of GeoIPCountryWhois.csv that I downloaded has clean \n line endings.
This is documented for the .reader() method:
If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference.
If, however, your csv file is so corrupted as to still contain unexpected newline characters in unexpected places, use this file subclass instead as a stop-gap measure:
class CleanlinesFile(file):
    def next(self):
        line = super(CleanlinesFile, self).next()
        return line.replace('\r', '').replace('\n', '') + '\n'
This class guarantees there will be no newlines anywhere in the returned results except as the very last character (just the way the csv module wants it). Use it instead of the open call; the 'rb' mode modifier becomes optional in this case:
def range_data(ip):
    r = []
    f = CleanlinesFile(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ',
                       'GeoIPCountryWhois.csv'))
    for num, row in enumerate(csv.reader(f)):
        # Your things.
I have a very large file (3.8G) that is an extract of users from a system at my school. I need to reprocess that file so that it just contains their ID and email address, comma separated.
I have very little experience with this and would like to use it as a learning exercise for Python.
The file has entries that look like this:
dn: uid=123456789012345,ou=Students,o=system.edu,o=system
LoginId: 0099886
mail: fflintstone@system.edu
dn: uid=543210987654321,ou=Students,o=system.edu,o=system
LoginId: 0083156
mail: brubble@system.edu
I am trying to get a file that looks like:
0099886,fflintstone@system.edu
0083156,brubble@system.edu
Any tips or code?
That actually looks like an LDIF file to me. The python-ldap library has a pure-Python LDIF handling library that could help if your file possesses some of the nasty gotchas possible in LDIF, e.g. Base64-encoded values, entry folding, etc.
You could use it like so:
import csv
import ldif

class ParseRecords(ldif.LDIFParser):
    def __init__(self, input, csv_writer):
        # LDIFParser needs the input file it will parse
        ldif.LDIFParser.__init__(self, input)
        self.csv_writer = csv_writer

    def handle(self, dn, entry):
        # attribute values arrive as lists, so take the first value of each
        self.csv_writer.writerow([entry['LoginId'][0], entry['mail'][0]])

with open('/path/to/large_file') as input, open('output_file', 'wb') as output:
    csv_writer = csv.writer(output)
    csv_writer.writerow(['LoginId', 'Mail'])
    ParseRecords(input, csv_writer).parse()
Edit
So to extract from a live LDAP directory, using the python-ldap library you would want to do something like this:
import csv
import ldap

con = ldap.initialize('ldap://server.fqdn.system.edu')
# if your LDAP directory requires authentication
# con.bind_s(username, password)
try:
    with open('output_file', 'wb') as output:
        csv_writer = csv.writer(output)
        csv_writer.writerow(['LoginId', 'Mail'])
        for dn, attrs in con.search_s('ou=Students,o=system.edu,o=system',
                                      ldap.SCOPE_SUBTREE,
                                      attrlist = ['LoginId', 'mail']):
            # attribute values come back as lists
            csv_writer.writerow([attrs['LoginId'][0], attrs['mail'][0]])
finally:
    # even if you don't have credentials, it's usually good to unbind
    con.unbind_s()
It's probably worthwhile reading through the documentation for the ldap module, especially the example.
Note that in the example above, I completely skipped supplying a filter, which you would probably want to do in production. A filter in LDAP is similar to the WHERE clause in a SQL statement; it restricts what objects are returned. Microsoft actually has a good guide on LDAP filters. The canonical reference for LDAP filters is RFC 4515.
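For instance, a hypothetical filter that only returns entries carrying both attributes might look like this (the objectClass value is a guess; check your directory's schema):
filterstr = '(&(objectClass=person)(LoginId=*)(mail=*))'
results = con.search_s('ou=Students,o=system.edu,o=system',
                       ldap.SCOPE_SUBTREE, filterstr,
                       attrlist = ['LoginId', 'mail'])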
Similarly, if there are potentially several thousand entries even after applying an appropriate filter, you may need to look into the LDAP paging control, though using that would, again, make the example more complex. Hopefully that's enough to get you started, but if anything comes up, feel free to ask or open a new question.
Good luck.
Assuming that the structure of each entry will always be the same, just do something like this:
import csv

# Open the file
f = open("/path/to/large.file", "r")

# Create an output file
output_file = open("/desired/path/to/final/file", "w")

# Use the CSV module to make use of existing functionality.
final_file = csv.writer(output_file)

# Write the header row - can be skipped if headers not needed.
final_file.writerow(["LoginID", "EmailAddress"])

# Set up our temporary cache for a user
current_user = []

# Iterate over the large file
# Note that we are avoiding loading the entire file into memory
for line in f:
    if line.startswith("LoginId"):
        current_user.append(line[9:].strip())
    # If more information is desired, simply add it to the conditions here
    # (additional elif's should do)
    # and add it to the current user.
    elif line.startswith("mail"):
        current_user.append(line[6:].strip())
        # Once you know you have reached the end of a user entry
        # write the row to the final file
        # and clear your temporary list.
        final_file.writerow(current_user)
        current_user = []
    # Skip lines that aren't interesting.
    else:
        continue
Again assuming your file is well-formed:
with open(inputfilename) as inputfile, open(outputfilename, 'w') as outputfile:
    mail = loginid = ''
    for line in inputfile:
        line = line.split(':')
        if line[0] not in ('LoginId', 'mail'):
            continue
        if line[0] == 'LoginId':
            loginid = line[1].strip()
        if line[0] == 'mail':
            mail = line[1].strip()
        if mail and loginid:
            outputfile.write(loginid + ',' + mail + '\n')
            mail = loginid = ''
Essentially equivalent to the other methods.
To open the file you'll want to use something like the with keyword to ensure it closes properly even if something goes wrong:
with open(<your_file>, "r") as f:
    # Do stuff
As for actually parsing out that information, I'd recommend building a dictionary of ID/email pairs. You'll also need variables for the uid and the email.
data = {}
uid = 0
email = ""
To actually parse through the file (the stuff run while your file is open) you can do something like this:
for line in f:
    if "uid=" in line:
        # Parse the user id out by grabbing the substring between the first = and ,
        uid = line[line.find("=")+1:line.find(",")]
    elif "mail:" in line:
        # Parse the email out by grabbing everything from the : to the end (removing the newline character)
        email = line[line.find(": ")+2:-1]
        # Given the formatting you've provided, this comes second so we can make an entry into the dict here
        data[uid] = email
Using the CSV writer (remember to import csv at the beginning of the file) we can output like this:
writer = csv.writer(open(<filename>, "wb"))  # csv.writer needs a file object, not a name
writer.writerow(["User", "Email"])
for id, mail in data.iteritems():
    writer.writerow([id, mail])
Another option is to open the writer before the file, write the header, then write rows to the CSV as you read lines from the file. This avoids pulling all the information into memory, which might be highly desirable. So putting it all together we get:
writer = csv.writer(open(<filename>, "wb"))
writer.writerow(["User", "Email"])
with open(<your_file>, "r") as f:
    for line in f:
        if "uid=" in line:
            # Parse the user id out by grabbing the substring between the first = and ,
            uid = line[line.find("=")+1:line.find(",")]
        elif "mail:" in line:
            # Parse the email out by grabbing everything from the : to the end (removing the newline character)
            email = line[line.find(": ")+2:-1]
            # Given the formatting you've provided, this comes second, so we can write the row here
            writer.writerow([uid, email])