Resolving a relative path from py:match in a genshi template - python

<py:match path="foo">
<?python
import os
href = select('#href').render()
SOMEWHERE = ... # what file contained the foo tag?
path = os.path.abspath(os.path.join(os.path.dirname(SOMEWHERE), href))
f = open(path, 'r')
# (do something interesting with f)
?>
</py:match>
...
<foo href="../path/relative/to/this/template/abcd.xyz"/>
What should go as "somewhere" above? I want that href attribute to be relative to the file with the foo tag in it, like href attributes on other tags.
Alternatively, what file contained the py:match block? This is less good because it may be in a different directory from the file with the foo tag.
Even less good: I could supply the path of the file I'm rendering as a context argument from outside Genshi, but that might be in a different directory from both of the above.

You need to make sure that the driver program (i.e., the Python program that parses the input file) runs in the directory of the file containing the foo tag. Otherwise, you need to pass the relative path (i.e., how to get from the directory in which the driver runs to the directory of the file being read) as a context argument to your Python code and add it to the os.path.join call.
With this setup (using Genshi 0.6 installed on Mac OS X 10.6.3 via the Fink package genshi-py26), os.getcwd() returns the directory of the file containing the foo tag, because that is the driver's working directory.
For such complicated path constructs I also strongly recommend using path = os.path.normpath(path), since you may not want unresolved ".." segments to leak into your resulting HTML code.
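As a minimal sketch of the context-argument approach, the helper below resolves an href against the directory of a template file. The names resolve_href and template_path are hypothetical; template_path stands for whatever path the driver passes into the template context, and nothing here is part of Genshi's API.

```python
import os

def resolve_href(template_path, href):
    """Resolve href relative to the directory containing template_path."""
    # template_path is assumed to be supplied by the driver program,
    # e.g. as a context argument; Genshi does not provide it here.
    base_dir = os.path.dirname(os.path.abspath(template_path))
    # normpath collapses ".." segments so they do not leak into the output
    return os.path.normpath(os.path.join(base_dir, href))

print(resolve_href("/srv/site/templates/page.html", "../data/abcd.xyz"))
# on a POSIX system this prints /srv/site/data/abcd.xyz
```

Inside the py:match block, the same call could replace the os.path.join line once the template path is available in the context.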

Related

Why os.path.realpath doesn't work properly

I just downloaded a file called "N_PR_8705_004A_.doc" into my "Downloads" folder and I want to move it into my "Stage NLP" folder using os. I know how to do it without os, but I'd like this to work because it's faster, and it simply doesn't. First I tried to get the path of my file like this:
import os
os.path.dirname(os.path.abspath("N_PR_8705_004A_.doc"))
# or os.path.realpath it's the same
and the result I get is:
'C:\\Users\\f002722\\Stage NLP'
whereas when I list all the files in this folder with:
os.listdir("C:\\Users\\f002722\\Stage NLP")
you can clearly see it is simply not there:
['.ipynb_checkpoints',
'ADR service study - D2 (1st part).pdf',
'basetal.py',
'Codes test',
'Cours NLP.ipynb',
'e Deorbit',
'edot CDF study.pdf',
'edot_v5.pdf',
'Entrainement.ipynb',
'ESA edot workshop May 6th 2014 - Summary.msg',
'ESA_edotWorkshop-_Envisat_attitude-Copy1',
'ESA_edotWorkshop-_Envisat_attitude.pdf',
'ESA_edotWorkshop_GNC_.pdf',
'ESA_INNOCENTI_Challenges.pdf',
'ESA_Robin_Biesbroek_edot.pdf',
'GMV_edot_Symposium.pdf',
'JOP_edotWorkshop.pdf',
'KT_HAARMANN_Edot.pdf',
'MDA_edot_Symposium_-_Robotic_Capture.pdf',
'MDA_eDot_Symposium_-_Robotic_Capture.pdf.kx2zd5w.partial',
'Note_Ariane_NLP.ipynb',
'Note_Ariane_NLP_2.ipynb',
'Note_Ariane_NLP_3.ipynb',
'OHB_eDotWorkshop_ADRM.pdf',
'OHB_Sweden_eDotWorkshop_PRISMA_and_IRIDES.pdf',
'SKA_Polska_eDotWorkshop_Net_Simulator.pdf',
'TAS_Carole_Billot_edot.pdf',
'Test.ipynb',
'Text_clustering_v3_2.py',
'Webinar_OOSandADR_7May2020.pdf',
'__pycache__']
So what is going on? I'm out of ideas here.
Thanks in advance.
I think I have a possible answer to your question. Neither realpath nor abspath require their arguments to name existing files. In particular, the documentation for abspath() says: "On most platforms, this is equivalent to calling the function normpath() as follows: normpath(join(os.getcwd(), path))."
This means that if you have a Python script that has a line like,
foo = os.path.dirname(os.path.abspath("doesnotexist"))
then the value of foo will be the current working directory of the script. Since "doesnotexist" isn't the name of a file in this directory, it won't show up if you do os.listdir(foo).
I notice that you wrote that "N_PR_8705_004A_.doc" was in your "Downloads" directory, which is obviously not the same as 'C:\\Users\\f002722\\Stage NLP'. If 'C:\\Users\\f002722\\Stage NLP' is the working directory for your Python script, then running os.path.dirname(os.path.abspath("N_PR_8705_004A_.doc")) is just like writing os.path.dirname(os.path.abspath("doesnotexist")), for the reasons that I just gave.
Python can't automatically figure out where a file lives from just a relative file name. For example, there could be many files named README.txt on a system, each in a different directory, so there's no way for os.path.abspath('README.txt') to know which of those directories you want.
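A small sketch of this behaviour: abspath only joins the name onto the current working directory, and os.path.exists is what actually checks the disk. The helper name locate and the file name are made up for illustration.

```python
import os

def locate(name):
    """Return the absolute path that name resolves to, and whether it exists."""
    full_path = os.path.abspath(name)  # purely textual: cwd + name
    return full_path, os.path.exists(full_path)

# A file name chosen so that it (almost certainly) does not exist.
full_path, exists = locate("surely_missing_file_for_demo.doc")
print(full_path)  # a path inside the current working directory
print(exists)     # whether anything is really there
```

Checking os.path.exists before trusting the result of abspath makes the mismatch visible immediately.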
To move the file "N_PR_8705_004A_.doc" from the "Downloads" directory to 'C:\\Users\\f002722\\Stage NLP', you'd probably need to do something like this:
import shutil
shutil.move('C:\\Users\\f002722\\Downloads\\N_PR_8705_004A_.doc',
'C:\\Users\\f002722\\Stage NLP')
presuming, of course, that the "Downloads" directory was inside 'C:\\Users\\f002722'.

pdfkit changes href from relative to absolute paths on conversion

I'm using pdfkit to convert HTML files that contain links with href attributes.
Inside the HTML, hrefs are written with relative paths, e.g.:
PIC
When I convert this to PDF, the hrefs seem to be automatically rewritten to absolute paths (C:/Users/...).
Why does pdfkit change the href?
Wkhtmltopdf, which pdfkit relies on, converts relative links to absolute links by default.
This can be stopped by using the command line tool with a special flag:
wkhtmltopdf --keep-relative-links src destination
Or by telling pdfkit to apply this option:
import sys
import pdfkit

def convert_to_pdf(path):
    try:
        # run the conversion and write the result to a file
        # (path_wkthmltopdf is assumed to hold the path to the wkhtmltopdf executable)
        config = pdfkit.configuration(wkhtmltopdf=path_wkthmltopdf)
        options = {
            '--keep-relative-links': ''
        }
        pdfkit.from_url(path+'.htm', path+'.pdf', configuration=config, options=options)
    except Exception as why:
        # report the error
        sys.stderr.write('Pdf Conversion Error: {}\n'.format(why))
        raise
Usually, when you create a PDF from an HTML file, the PDF will be opened in another location (for example, on another computer after you send it by mail). The full path is therefore needed for the links to resolve correctly.
Of course, this only works if the other computer can access that path. With paths on C:, the links only work on the machine that created the PDF, not on other PCs.

Search for file names that contain words from a list and have a certain file extension

Beginner at Python. I'm trying to search users' folders for illegal content. I want to find all files whose names contain one or more words from the list below and that also have one of the listed extensions.
I can filter the files using file.endswith, but I don't know how to add the word condition.
I've looked through the site and have only come across how to search for a single word, not a list of words.
Thank you in advance.
import os
L = ['720p','aac','ac3','bdrip','brrip','demonoid','disc','hdtv','dvdrip',
     'edition','sample','torrent','www','x264','xvid']
# raw string, so the backslash is not treated as an escape character
for root, dirs, files in os.walk(r"Y:\User Folders"):
    for file in files:
        if file.endswith(('.7z','.3gp','.alb','.ape','.avi','.cbr','.cbz','.cue','.divx','.epub','.flac',
                          '.flv','.idx','.iso','.m2ts','.m2v','.m3u','.m4a','.m4b','.m4p','.m4v','.md5',
                          '.mkv','.mobi','.mov','.mp3','.mp4','.mpeg','.mpg','.mta','.nfo','.ogg','.ogm',
                          '.pla','.rar','.rm','.rmvb','.sfap0','.sfk','.sfv','.sls','.smfmf','.srt','.sub',
                          '.torrent','.vob','.wav','.wma','.wmv','.wpl','.zip')):
            print(os.path.join(root, file))
Perhaps it might be better to do a reverse search and display a warning about files that DON'T match the file types you want. For instance, you could do this:
if file.endswith((".txt", ".py")):
    print("File is ok!")
else:
    print("File is not ok!")
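One possible way to add the word condition from the question is to combine str.endswith with any() over the word list. This is only a sketch: the function names is_suspect and find_suspects are hypothetical, the lists are shortened, and matching is done case-insensitively on the file name, which is an assumption about what is wanted.

```python
import os

# Shortened versions of the lists from the question.
WORDS = ['720p', 'bdrip', 'dvdrip', 'hdtv', 'x264', 'xvid']
EXTENSIONS = ('.avi', '.mkv', '.mp4', '.rar', '.zip')

def is_suspect(filename, words=WORDS, extensions=EXTENSIONS):
    """True if the name has a listed extension AND contains a listed word."""
    lowered = filename.lower()
    return lowered.endswith(extensions) and any(word in lowered for word in words)

def find_suspects(top):
    """Yield the full path of every suspect file below the directory top."""
    for root, dirs, files in os.walk(top):
        for name in files:
            if is_suspect(name):
                yield os.path.join(root, name)
```

Calling find_suspects(r"Y:\User Folders") and printing each result would then reproduce the loop from the question with both conditions applied.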
Using py.path.local from the py package
The py package (install with $ pip install py) offers a very nice interface for working with files.
from py.path import local
def isbadname(path):
    bad_extensions = [".pyc", ".txt"]
    bad_names = ["code", "xml"]
    return (path.ext in bad_extensions) or (path.purebasename in bad_names)
for path in local(".").visit(isbadname):
    print(path.strpath)
Explained:
Import
from py.path import local
py.path.local creates "objectified" file names. To keep my code short, I import it this way so that I can use just local to objectify file name strings.
Create an objectified path to the local directory:
local(".")
The created object is not a string but an object with many interesting properties and methods.
Listing all files within some directory:
local(".").visit("*.txt")
returns a generator providing all paths to files with the extension ".txt".
An alternative way to select files is to provide a function that takes the argument path (an objectified file name) and returns True if the file is to be used, False otherwise.
The function isbadname serves exactly this purpose.
If you want to google for more information, search for py path local (the name py alone does not give good hits).
For more, see https://py.readthedocs.io/en/latest/path.html
Note that if you use the pytest package, py is installed with it (for good reason: it makes tests related to file names much more readable and shorter).

Python - extract and modify a file path in all files in a directory in linux

I have .sh files and .json files that contain file paths pointing to a specific directory, but I have to keep changing those paths depending on where my Python script is run.
E.g., the content of one of my .sh files is
"cd /home/aswany/BotStudioInstallation/databricks/platform/databricksastro"
and I need to change the file path from Python code, because the part
"/home/aswany/BotStudioInstallation/" keeps changing depending on where databricks is located.
I tried the following code:
replaceAll(str(self.currentdirectory)+
"/databricks/platform/devsettings.json",
"/home/holmes/BotStudioInstallation",self.currentdirectory)
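and the function replaceAll is:
import fileinput
import sys

def replaceAll(self, file, searchExp, replaceExp):
    # with inplace=1, everything written to stdout goes back into the file
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp, replaceExp)
        sys.stdout.write(line)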
but the above code only replaces the string
"home/holmes/BotStudioInstallation" with the current directory I am logged in to. I cannot be sure that "home/holmes/BotStudioInstallation" is the only possibility; it keeps changing, e.g. "home/aswany/BotStudioInstallation", "home/dev3/BotStudioInstallation", etc. I thought of using a regular expression for this.
Please help me.
Not sure I 100% understood your issue, but maybe I can help nonetheless.
As pointed out by J.F. Sebastian, you can use relative paths and remove the base part of the path. Using ./databricks/platform/devsettings.json might be enough. This is by far the most elegant solution.
If for any reason it is not, you can store only the directory you need to access and append it to the base directory whenever you need it. That should allow you to deal with changes in the base directory. However, if the files will also be used by applications other than your own, that might not be an option.
dir = get_dir_from_json()
dir_with_base = self.currentdirectory + dir
Alternatively, though it is not an elegant solution, you can avoid regex by putting a fixed placeholder "pattern" in the files and always replacing that.
{
"directory": "<<_replace_me_>>/databricks/platform"
}
Then you know you can always replace "<<_replace_me_>>" with the base directory.
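Both ideas can be sketched in a few lines. The regular expression below assumes the base always has the form /home/<user>/BotStudioInstallation with a leading slash, which is an assumption drawn from the examples in the question; the function names rebase_paths and fill_placeholder and the target /opt/bots are hypothetical.

```python
import re

# Matches /home/<any user>/BotStudioInstallation (assumed layout).
BASE_PATTERN = re.compile(r"/home/[^/\s]+/BotStudioInstallation")

def rebase_paths(text, new_base):
    """Replace every matching base directory in text with new_base."""
    # A function is used as the replacement so that backslashes in
    # new_base are never interpreted as regex escapes.
    return BASE_PATTERN.sub(lambda match: new_base, text)

def fill_placeholder(text, new_base, placeholder="<<_replace_me_>>"):
    """Placeholder variant: plain string replacement, no regex needed."""
    return text.replace(placeholder, new_base)

line = "cd /home/aswany/BotStudioInstallation/databricks/platform/databricksastro"
print(rebase_paths(line, "/opt/bots"))
# prints: cd /opt/bots/databricks/platform/databricksastro
```

Either function could be applied to each line inside the fileinput loop from the question in place of the fixed-string replace.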

Passing commands to OS: What is wrong here?

So, I want to create a simple script to create directories based upon the file names contained within a certain folder.
My method looks like this:
import subprocess

def make_new_folders(filenames, destination):
    """
    Take a list of presets and create new directories using mkdir
    """
    for filename in filenames:
        path = '"%s/%s/"' % (destination, filename)
        subprocess.call(["mkdir", path])
For some reason I can't get the command to work.
If I pass in a file named "Test Folder", I get an error such as:
mkdir: "/Users/soundteam/Desktop/PlayGround/Test Folder: No such file or directory
Printing the 'path' variable results in:
"/Users/soundteam/Desktop/PlayGround/Test Folder/"
Can anyone point me in the right direction?
First of all, you should use os.path.join() to glue your path parts together because it works cross-platform.
Furthermore, there are built-in commands like os.mkdir or os.makedirs (which is really cool because it's recursive) to create folders. Creating a subprocess is expensive and, in this case, not a good idea.
In your example you're passing literal double quotes ("destination/filename") to subprocess, which you don't have to do. A shell needs the quotes when file or folder names contain whitespace, but subprocess passes each list element to the program as a single argument, so no quoting is required.
You don't need the double quotes. subprocess passes the parameters directly to the process, so you don't need to prepare them for parsing by a shell. You also don't need the trailing slash, and should use os.path.join to combine path components:
path = os.path.join(destination, filename)
EDIT: You should accept @Fabian's answer, which explains that you don't need subprocess at all (I knew that).
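Putting the two answers together, a version without subprocess might look like the sketch below. It assumes Python 3.2 or later for the exist_ok argument of os.makedirs; returning the list of created paths is an addition for illustration and not part of the original function.

```python
import os

def make_new_folders(filenames, destination):
    """Create one directory per name in filenames below destination."""
    created = []
    for filename in filenames:
        # No manual quoting: spaces in names are fine for os.makedirs.
        path = os.path.join(destination, filename)
        os.makedirs(path, exist_ok=True)  # also creates missing parents
        created.append(path)
    return created
```

With exist_ok=True the function can be run repeatedly on the same folder without raising an error for directories that already exist.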
