Python Parse GitHub File

I am looking for more information regarding parsing files in Python, specifically from someone's GitHub. For instance, if Person A has a GitHub account with a file with the contents:
name = Person A
my script = scriptA.sh
script output = yarn output
my other files = fileA, fileB
I would like to be able to access this information and store it in my Python script.
This is not for a class or anything, so I am struggling to find good, clear, beginner-level information covering concepts like this. Does anyone have any advice? I have a basic understanding of parsing and Python, but I want to take it further. Here is some pseudocode I am using to brainstorm.
def parseAddFile(filename):
    with open(filename, 'r') as file:
        lines = file.readlines()  # this should read all the lines of the file
    dict_of_contents = {}  # this will then hold the contents as a dictionary
    for line in lines:
        ...  # name = ..., etc.
Just looking to grow my knowledge, not asking for answers only.
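One way to approach this, as a rough sketch rather than a definitive answer: download the file's raw text, then split each "key = value" line at the first equals sign. The function names below are made up for illustration. GitHub serves raw file contents from URLs of the form https://raw.githubusercontent.com/<user>/<repo>/<branch>/<path>, so fetch_text() would be called with such a URL; the user, repository, branch and path are placeholders you would fill in. The demo parses the sample contents from the question directly, so it runs without network access.

```python
from urllib.request import urlopen


def parse_key_value_text(text):
    """Parse 'key = value' lines into a dict, skipping lines without '='."""
    contents = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # blank or malformed line
        contents[key.strip()] = value.strip()
    return contents


def fetch_text(url):
    """Download a text file over HTTP(S) and decode it as UTF-8."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


# Sample contents copied from the question; a real script would call
# fetch_text() with the raw URL of the file instead.
sample = """name = Person A
my script = scriptA.sh
script output = yarn output
my other files = fileA, fileB"""

parsed = parse_key_value_text(sample)
# Split the comma-separated entry into a list of file names.
other_files = [item.strip() for item in parsed["my other files"].split(",")]
print(parsed["name"], other_files)
```

Keeping the parsing separate from the download means the parser can be tried out on a plain string first, before any network code is involved.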

Related

Read paths in csv and open in unix

I need to read paths from a csv file and open the files at those paths in unix. I am wondering if there is a way to do that using unix or python commands. I am unsure how to proceed and I have not been able to find many resources on the net either.
The excel.csv has a large number of rows, similar to the ones below. I need to open excel.csv, read the first line and go to that file path. Once the file at that path is open, I need to read it and extract certain information. I tried using python for this but could not find much information, so I am wondering if I can use unix commands instead. I would appreciate any reference or help using either python or unix commands. Thank you!
/folder/file1
/folder/file2
/folder/file3

It shouldn't be very difficult to do this in Python, as reading csv files is part of the standard library. Your code could look something like this:
import csv

with open('data.csv', newline='') as fh:
    # depending on whether the first row is a header or not,
    # you can also use the simple csv.reader here
    for row in csv.DictReader(fh, strict=True):
        # again, if you use the simple csv.reader, you'll have to access
        # the column via index instead
        file_path = row['path']
        with open(file_path, 'r') as fh2:
            data = fh2.read()
            # do something with data
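Since the sample in the question has no header row, here is a sketch of the csv.reader variant mentioned in the comments above, reading the path from the first column by index. The function name read_listed_files and the temporary demo files are assumptions made only so the example runs anywhere; in practice you would pass the path to your own excel.csv.

```python
import csv
import os
import tempfile


def read_listed_files(csv_path):
    """Return {path: contents} for every path in the first column of csv_path."""
    contents = {}
    with open(csv_path, newline="") as fh:
        for row in csv.reader(fh):
            if not row:
                continue  # skip empty rows
            file_path = row[0]
            with open(file_path, "r") as fh2:
                contents[file_path] = fh2.read()
    return contents


# Demo with throwaway files so the sketch does not depend on real paths.
with tempfile.TemporaryDirectory() as tmp:
    targets = []
    for name in ("file1", "file2"):
        path = os.path.join(tmp, name)
        with open(path, "w") as out:
            out.write("contents of " + name)
        targets.append(path)
    csv_path = os.path.join(tmp, "excel.csv")
    with open(csv_path, "w", newline="") as out:
        csv.writer(out).writerows([p] for p in targets)
    result = read_listed_files(csv_path)

print(len(result))
```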

Python libtorrent, get file list names

I am using libtorrent for Python 3.6. I just want to get the names of everything downloaded in a session, e.g. the folder name, the file names, etc.
I searched around the web but didn't come across anything. I am using the following example:
https://www.libtorrent.org/python_binding.html
So when the download finishes, I want to know which files this session downloaded. How can I achieve that? Thanks in advance!

Finally found the answer, the code is:
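import time
import libtorrent

# session, magnetLink and params are created earlier, as in the linked example
handle = libtorrent.add_magnet_uri(session, magnetLink, params)
session.start_dht()
# wait until the torrent metadata has been fetched
while not handle.has_metadata():
    time.sleep(1)
torinfo = handle.get_torrent_info()
# print the path of every file in the torrent
for x in range(torinfo.files().num_files()):
    print(torinfo.files().file_path(x))
The code above prints the names of the files that came with the magnet link.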

Read msi with python msilib

I need to read an msi file and run some queries against it. But although msilib is part of the Python standard library, it is poorly documented.
To write queries I have to know the database schema, and I can't find any examples or methods for getting it from the file.
Here is my code I'm trying to make work:
import msilib
path = "C:\\Users\\Paul\\Desktop\\my.msi" #I cannot share msi
dbobject = msilib.OpenDatabase(path, msilib.MSIDBOPEN_READONLY)
view = dbobject.OpenView("SELECT FileName FROM File")
rec = view.Execute(None)
r = v.Fetch()
And the rec variable is None. But I can open the MSI file with the InstEd tool and see that File is present in the tables list and contains a lot of records.
What am I doing wrong?

Your code is suspect, as the last line will throw a NameError in your sample. So let's ignore that line.
The real problem is that view.Execute returns nothing of use. Under the hood, the MsiViewExecute function only reports success or failure. After you call it, you then need to call view.Fetch, which may be what your last line intended to do.

Optparse to find a string

I have a mysql database and I am trying to print all the test results for a specific student. I am trying to create a command-line tool where I enter the username and it then shows his/her test results.
I already visited this page ("optparse and strings") but I couldn't find my answer.
#after connecting to mysql
cursor.execute("select * from database")
def main():
    parser = optparse.OptionParser()
    parser.add_option("-n", "--name", type="string", help = "student name")
    (options, args) = parser.parse_args()
    studentinfo = []
    f = open("Index", "r")
    #Index is inside database, it is a folder holds all kinds of files

Well, the first thing you should do is not use optparse, as it's deprecated - use argparse instead. The help I linked you to is quite useful and informative, guiding you through creating a parser and setting the different options. After reading through it you should have no problem accessing the variables passed from the command line.
However, there are other errors in your script that will prevent it from running. First, you can't open a directory with the open() function - you need to use os.listdir() for that, then read each of the files in the resulting list. It is also very much advisable to use a context manager when open()ing files:
import os

index_dir = "/path/to/Index"
for filename in os.listdir(index_dir):
    # os.listdir() returns bare names, so join them with the directory
    with open(os.path.join(index_dir, filename), "r") as f:
        for line in f:
            pass  # do stuff with each line
This way you don't need to worry about closing the file handle later on, and it's just a generally cleaner way of doing things.
You don't provide enough information in your question as to how to get the student's scores, so I'm afraid I can't help you there. You'll (I assume) have to connect the data that's coming out of your database query with the files (and their contents) in the Index directory. I suspect that if the student scores are kept in the DB, then you'll need to retrieve them from the DB using SQL, instead of trying to read raw files in the filesystem. You can easily get the student of interest's name from the command line, but then you'll have to interpolate that into a SQL query to find the correct table, select the rows from the table corresponding to the student's test scores, then process the results with Python to print out a pretty summary.
Good luck!
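To make the argparse suggestion concrete, here is a minimal sketch of reading the student name from the command line and passing it to a parameterized query. Everything about the database is an assumption: an in-memory SQLite database stands in for MySQL so the example is self-contained, and the results table and its columns are hypothetical. With a MySQL driver the placeholder is usually %s rather than ?, but the idea of passing the name as a parameter instead of pasting it into the SQL string stays the same.

```python
import argparse
import sqlite3


def build_parser():
    """Create the command-line parser with a required -n/--name option."""
    parser = argparse.ArgumentParser(description="Print a student's test results")
    parser.add_argument("-n", "--name", required=True, help="student name")
    return parser


def fetch_results(connection, student_name):
    """Return (test, score) rows for one student using a parameterized query."""
    cursor = connection.execute(
        "SELECT test, score FROM results WHERE student = ? ORDER BY test",
        (student_name,),
    )
    return cursor.fetchall()


# In-memory SQLite stands in for the MySQL database;
# the table and column names are hypothetical.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE results (student TEXT, test TEXT, score INTEGER)")
connection.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [("alice", "math", 90), ("alice", "physics", 85), ("bob", "math", 70)],
)

# parse_args() normally reads sys.argv; an explicit list is passed for the demo.
options = build_parser().parse_args(["--name", "alice"])
rows = fetch_results(connection, options.name)
for test, score in rows:
    print(f"{options.name}: {test} = {score}")
```

In a real script you would call parse_args() with no arguments and run it as, for example, python script.py -n alice.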

How to concatenate several Javascript files into one file using Python

I would like to know how I can use Python to concatenate multiple Javascript files into just one file.
I am building a component based engine in Javascript, and I want to distribute it using just one file, for example, engine.js.
Alternatively, I'd like the users to get the whole source, which has a hierarchy of files and directories, and with the whole source they should get a build.py Python script, that can be edited to include various systems and components in it, which are basically .js files in components/ and systems/ directories.
How can I load files which are described in a list (paths) and combine them into one file?
For example:
toLoad = [
    "core/base.js",
    "components/Position.js",
    "systems/Rendering.js",
]
The script should concatenate these in order.
Also, this is a Git project. Is there a way for the script to read the version of the program from Git and then write it as a comment at the beginning?

This will concatenate your files:
def read_entirely(file):
    with open(file, 'r') as handle:
        return handle.read()

result = '\n'.join(read_entirely(file) for file in toLoad)
You may then output the result as necessary, or write it to a file using code similar to the following:
with open('engine.js', 'w') as handle:
    handle.write(result)

How about something like this?
final_script = ''
for script_name in toLoad:
    with open(script_name, 'r') as f:
        final_script += ('\n' + f.read())
with open('engine.js', 'w') as f:
    f.write(final_script)

You can do it yourself, but this is a real problem that existing tools already solve in a more sophisticated way. Consider "JavaScript Minification", e.g. using http://developer.yahoo.com/yui/compressor/
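As for the Git part of the question, one possible approach (a sketch, not the only way) is to ask Git for a version string with git describe --tags --always and write it as a comment on the first line of the combined file. The function names git_version and build are made up for illustration, and the demo uses throwaway files and a fixed version string so that it runs outside a Git checkout; a real build.py would pass toLoad and git_version() instead.

```python
import os
import subprocess
import tempfile


def git_version(default="unknown"):
    """Return the output of `git describe --tags --always`, or default on failure."""
    try:
        completed = subprocess.run(
            ["git", "describe", "--tags", "--always"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return default  # git missing, or not inside a repository
    return completed.stdout.strip() or default


def build(paths, version):
    """Concatenate the files in order, prefixed by a version comment."""
    parts = ["// version: " + version]
    for path in paths:
        with open(path, "r") as handle:
            parts.append(handle.read())
    return "\n".join(parts)


# Demo on throwaway files with a fixed version string.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, body in (("base.js", "var a = 1;"), ("Position.js", "var b = 2;")):
        path = os.path.join(tmp, name)
        with open(path, "w") as handle:
            handle.write(body)
        paths.append(path)
    bundle = build(paths, "v0.1-demo")

print(bundle)
```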
