Read msi with python msilib - python

I need to read an MSI file and run some queries against it. But although msilib is part of Python's standard library, it is poorly documented.
To write queries I need to know the database schema, and I can't find any examples or methods for getting it from the file.
Here is the code I'm trying to get working:
import msilib
path = "C:\\Users\\Paul\\Desktop\\my.msi" #I cannot share msi
dbobject = msilib.OpenDatabase(path, msilib.MSIDBOPEN_READONLY)
view = dbobject.OpenView("SELECT FileName FROM File")
rec = view.Execute(None)
r = v.Fetch()
And the rec variable is None. But I can open the MSI file with the InstEd tool and see that the File table is present and contains a lot of records.
What am I doing wrong?

Your code is suspect, as the last line will throw a NameError in your sample. So let's ignore that line.
The real problem is that view.Execute returns nothing of use. Under the hood, the MsiViewExecute function only reports success or failure. After calling it, you then need to call view.Fetch to retrieve each record, which may be what your last line intended to do.
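A minimal sketch of the Execute-then-Fetch pattern described above, using the table and column names from the question. It assumes msilib is available (it is Windows-only and was removed from the standard library in Python 3.13); the function name fetch_file_names is hypothetical. Depending on the Python version, Fetch either returns None or raises MSIError once the results are exhausted, so the loop handles both.

```python
def fetch_file_names(msi_path):
    """Return the FileName column of the File table as a list of strings."""
    # Imported here because msilib exists only on Windows builds of Python
    import msilib

    db = msilib.OpenDatabase(msi_path, msilib.MSIDBOPEN_READONLY)
    view = db.OpenView("SELECT FileName FROM File")
    view.Execute(None)  # runs the query; the return value carries no data

    names = []
    while True:
        try:
            record = view.Fetch()
        except msilib.MSIError:
            break  # some Python versions raise when no records are left
        if record is None:
            break  # other versions return None instead
        names.append(record.GetString(1))  # record fields are 1-indexed
    view.Close()
    return names
```

Calling fetch_file_names(path) with the path from the question should then return one string per row of the File table.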

Related

Accessing Windows File "Tags" metadata in Python

How do I access the tags attribute here in the Windows File Properties panel?
Are there any modules I can use? Most Google searches yield properties related to media files or file access times, but not much related to metadata properties like Tags, Description, etc.
The exif module was able to access many more properties than anything else I've found, but it still wasn't able to read the 'Tags' property.
The Description -> Tags property is what I want to read and write to a file.
There's an entire module dedicated to exactly what I wanted: IPTCInfo3.
import iptcinfo3, os, sys, random, string
# Random string generator
rnd = lambda length=3 : ''.join(random.choices(list(string.ascii_letters), k=length))
# Path to the file, open a IPTCInfo object
path = os.path.join(sys.path[0], 'DSC_7960.jpg')
info = iptcinfo3.IPTCInfo(path)
# Show the keywords
print(info['keywords'])
# Replace the keyword list with a single random keyword and save
info['keywords'] = [rnd()]
info.save()
# Remove the weird ghost file created after saving
os.remove(path + '~')
I'm not sure what the ghost file is or does. It looks like an exact copy of the original file, since the file size is the same, but I remove it anyway because it serves no purpose for reading and writing the metadata I need.
I've noticed some odd behaviour while setting keywords: some get swallowed into the file (the file size changes, so I know they're there, but Windows doesn't show them), and they only reappear after I manually delete the keywords. Very strange.

Evaluating File Paths in Excel

I have an ever-increasing list of file paths (around 5000 records now) in Excel. More specifically, column A holds a unique identifier and column B holds a file path that leads to a picture for that identifier.
The process of adding the file paths is very manual and mistakes sometimes occur. So I want to write code that goes through each of these file paths and, if a path doesn't open or returns an error, stores it in a list so that I can go directly to those entries and fix them.
I was thinking of writing Python code that checks each file path in the Google Chrome URL bar (I have found that to work better than clicking the hyperlink in Excel directly), but it's been a while since I used Python and I don't know where to start.
Any recommendation/ideas of how to achieve this?
Thank you,
Ricardo G.
To read excel files, I prefer to use the pandas library, specifically the read_excel function. You can also check if a filepath is a valid, existing file in your filesystem using the os.path module. os.path.isfile returns True if the provided path points to an actual file, so you want to use a list comprehension with a filter to only have filepaths where that is not the case.
import pandas as pd
import os
df = pd.read_excel('path/to/excel')
bad_files = [fp for fp in df['filepath_column'] if not os.path.isfile(fp)]
I'm not sure what you mean by check with google chrome, but if you're talking about local files, this should work well for you.
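Since the goal is to go back and fix specific rows, it may help to keep the identifier next to each bad path. The following is a minimal sketch under that assumption, using only the standard library; the function name find_missing_paths and the column names in the usage note are hypothetical.

```python
import os

def find_missing_paths(rows):
    """Return the (identifier, path) pairs whose path is not an existing file."""
    missing = []
    for identifier, path in rows:
        # Empty cells arrive as None or NaN rather than str, so flag those too
        if not isinstance(path, str) or not os.path.isfile(path):
            missing.append((identifier, path))
    return missing
```

With the DataFrame from above, this could be called as find_missing_paths(zip(df['id_column'], df['filepath_column'])).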

Get file id in windows with python?

Hello, I want to get the file ID of files on Windows with Python. When I searched, I could only find how to do it in other languages. Does anybody know how I can achieve this in Python?
As far as I have looked and researched, there is no such file id available. But instead, you can have the creation date on Windows and Mac, and the last modified on Linux. These two are usually sufficient to find unique files, even if they are renamed, altered, or whatever.
Here's how to do it; the docstring links to the SO thread where I found the solution.
import os
import platform
def creation_date(path_to_file):
    """
    Try to get the date that a file was created, falling back to when it was
    last modified if that isn't possible.
    See http://stackoverflow.com/a/39501288/1709587 for explanation.
    """
    if platform.system() == 'Windows':
        return os.path.getctime(path_to_file)
    else:
        stat = os.stat(path_to_file)
        try:
            return stat.st_birthtime
        except AttributeError:
            # We're probably on Linux. No easy way to get creation dates here,
            # so we'll settle for when its content was last modified.
            return stat.st_mtime
Another option is os.stat, whose st_ino field holds the file index on Windows:
import os
path_to_file = r"path_to_your_file"
file_id = os.stat(path_to_file, follow_symlinks=False).st_ino
print(hex(file_id))
To check the result from the command line:
c:\> fsutil file queryfileid path_to_your_file
In Python you can run the same command:
print(os.popen(f'fsutil file queryfileid "{path_to_file}"').read())
Or, when the file has hard links:
print(os.popen(f'fsutil hardlink list "{path_to_file}"').read())
To find the file name for a given ID:
print(os.popen(r'fsutil file queryFileNameById c:\ the_file_id').read())
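Because st_ino is only unique per volume, a small helper that pairs it with st_dev can serve as a file identity. This is a sketch using only os.stat; the helper name file_id is hypothetical.

```python
import os

def file_id(path):
    """Return a (device, inode) pair identifying the file at path."""
    # follow_symlinks=False identifies a symlink itself, not its target
    st = os.stat(path, follow_symlinks=False)
    return (st.st_dev, st.st_ino)
```

On common filesystems the pair should stay the same when a file is renamed within the same volume, so comparing file_id(old_path) before the rename with file_id(new_path) afterwards is one way to recognise the same file under a new name.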

Python FTP: parseable directory listing

I'm using the Python FTP lib for the first time. My goal is simply to connect to an FTP site, get a directory listing, and then download all files that are newer than a certain date (e.g. all files created or modified within the last 5 days).
This turned out to be a bit more complicated than I expected for a few reasons. Firstly, I've discovered that there is no real "standard" FTP file list format. Most FTP sites conventionally use the UNIX ls format, but this isn't guaranteed.
So, my initial thought was to simply parse the UNIX ls format: it's not so bad after all, and it seems most mainstream FTP servers will use it in response to the LIST command.
This was easy enough to code with Python's ftplib:
import ftplib
def callback(line):
    print(line)
ftp = ftplib.FTP("ftp.example.com")
result = ftp.login(user="myusername", passwd="XXXXXXXX")
dirlist = ftp.retrlines("LIST", callback)
This works, except the problem is that the date given in the UNIX list format returned by the FTP server I'm dealing with doesn't have a year. A typical entry is:
-rw-rw-r-- 1 user user 1505581 Dec 9 21:53 somefile.txt
So the problem here is that I'd have to code in extra logic to sort of "guess" if the date refers to the current year or not. Except really, I'd much rather not code some complex logic like that when it seems so unnecessary - there's no reason the FTP server shouldn't be able to give me the year.
Okay, so after Googling around for alternative ways to get LIST information, I've found that many FTP servers support the MLST and MLSD commands, which apparently provide a directory listing in a "machine-readable" format, i.e. a list format that is much more amenable to automatic processing. Great. So, I try the following:
dirlist = ftp.sendcmd("MLST")
print(dirlist)
This produces a single line response, giving me data about the current working directory, but NOT a list of files.
250-Start of list for /
modify=20151210094445;perm=flcdmpe;type=cdir;unique=808U6EC0051;UNIX.group=1003;UNIX.mode=0775;UNIX.owner=1229; /
250 End of list
So this looks great, and easy to parse, and it also has a modify date with the year. Except it seems the MLST command is showing information about the directory itself, rather than a listing of files.
So, I've Googled around and read the relevant RFCs, but can't seem to figure out how to get a listing of files in "MLST" format. It seems the MLSD command is what I want, but I get a 425 error when I try that:
  File "temp8.py", line 8, in <module>
    dirlist = ftp.sendcmd("MLSD")
  File "/usr/lib/python3.2/ftplib.py", line 255, in sendcmd
    return self.getresp()
  File "/usr/lib/python3.2/ftplib.py", line 227, in getresp
    raise error_temp(resp)
ftplib.error_temp: 425 Unable to build data connection: Invalid argument
So how can I get a full directory listing in MLST/MLSD format here?
There is another module, ftputil, which is built on top of ftplib and has many features emulating os, os.path, and shutil. I found it easy to use and robust for this kind of operation. Maybe you could give it a try.
As for your purpose, the example code in its introduction solves exactly this problem.
You could also try this and see whether it gives you what you need:
for name, facts in ftp.mlsd('directory'):
    print(name, facts)
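The 425 error in the question is consistent with how MLSD works: unlike MLST, it sends its listing over a data connection, and sendcmd only talks on the control connection. ftplib's FTP.mlsd method (Python 3.3 and later) sets up the data connection and yields (name, facts) pairs. Below is a sketch of the "newer than N days" filter built on it, assuming the server reports the modify and type facts; the function name recent_files and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def recent_files(ftp, path="", days=5, now=None):
    """Return names of regular files under path modified in the last `days` days.

    `ftp` is expected to be a connected, logged-in ftplib.FTP instance.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    names = []
    for name, facts in ftp.mlsd(path, facts=["type", "modify"]):
        # Skip directories (type=dir/cdir/pdir) and entries without a timestamp
        if facts.get("type") != "file" or "modify" not in facts:
            continue
        # MLSD timestamps are UTC, formatted as YYYYMMDDHHMMSS[.sss]
        modified = datetime.strptime(facts["modify"][:14], "%Y%m%d%H%M%S")
        modified = modified.replace(tzinfo=timezone.utc)
        if modified >= cutoff:
            names.append(name)
    return names
```

Each returned name could then be downloaded with ftp.retrbinary. If the server does not support MLSD at all, this approach will not work and the LIST output has to be parsed instead.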
I am working on something similar, where I need to parse the contents of a directory and all of its subdirectories. However, the server I am working with does not allow the MLST command, so I accomplished what I needed as follows:
Parse the main directory content.
Loop through the main directory content.
Collect each loop's output into a pandas DataFrame.
import pandas as pd
test = pd.Series(ftp.nlst('/target directory/'))
rows = []
for i in test:
    data_dir = '/target directory/' + i
    rows.append(pd.Series(ftp.nlst(data_dir)))
df_server_content = pd.DataFrame(rows)

Python 3: Save File to Specified Location

I have a rather simple program that writes HTML code ready for use.
It works fine, except that if one were to run the program from the Python command line, as is the default, the HTML file that is created is created where python.exe is, not where the program I wrote is. And that's a problem.
Do you know a way of getting the .write() function to write a file to a specific location on the disc (e.g. C:\Users\User\Desktop)?
Extra cool-points if you know how to open a file browser window.
The first problem is probably that you are not including the full path when you open the file for writing. For details on opening a web browser, read this fine manual.
import os
import webbrowser
target_dir = r"C:\full\path\to\where\you\want\it"
filename = "output.html"  # example name; use your own
fullname = os.path.join(target_dir, filename)
with open(fullname, "w") as f:
    f.write("<html>....</html>")
# A Windows drive path needs three slashes after "file:"
url = "file:///" + fullname.replace("\\", "/")
webbrowser.open(url, True, True)
BTW: the code is the same in Python 2.6.
I'll admit I don't know Python 3, so I may be wrong, but in Python 2, you can just check the __file__ variable in your module to get the name of the file it was loaded from. Just create your file in that same directory (preferably using os.path.dirname and os.path.join to remain platform-independent).
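The __file__ approach also works in Python 3. A minimal sketch, assuming the script is run from a file (so that __file__ is defined); the helper name path_next_to is hypothetical.

```python
import os

def path_next_to(script_file, filename):
    """Build a path to `filename` in the same directory as `script_file`."""
    # abspath guards against __file__ being a relative path
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, filename)
```

Inside the program, open(path_next_to(__file__, "page.html"), "w") would then create the HTML file beside the script rather than in the current working directory.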
