I have a large number of auto-generated code files that are identifiable by having _pb2 in the file name.
When I search in PyCharm with Ctrl+Shift+F, I can use a file mask. I would like, for instance, to find all Python files (*.py) that do not have _pb2 in their name. Is there a way to achieve that?
You can include and exclude files and directories by creating a Custom Scope that filters using a combination of filename wildcards.
Ctrl+Shift+F to open "Find in Path".
Create a new Custom Scope following steps 2-4 in the screenshot.
Enter the pattern. For your specification it would be file[Project_Name]:*.py&&!file:*_pb2*
Afterwards the search results are restricted to within the Custom Scope.
Source at JetBrains official site: "Scope configuration controls"
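To sanity-check which file names the pattern above is meant to keep, the same include/exclude logic can be mimicked outside the IDE. This is only a rough sketch of the filter's intent using Python's fnmatch module, not how PyCharm evaluates scopes; the function name in_scope is made up for illustration.

```python
from fnmatch import fnmatchcase

def in_scope(filename):
    """Mimic '*.py && !*_pb2*': keep Python files whose name lacks _pb2."""
    # fnmatchcase is used so the result does not depend on the operating system.
    return fnmatchcase(filename, "*.py") and not fnmatchcase(filename, "*_pb2*")

# Hypothetical file names, for illustration only.
for name in ["service.py", "service_pb2.py", "service_pb2_grpc.py", "notes.txt"]:
    print(name, in_scope(name))
```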
I need to read a specific set of text files using Spark:
df = spark.read.text(paths)
The problem is that Spark treats the path as either a directory or a file. This has two problems:
If the file does not exist, it's skipped with just a warning in the log – "directory does not exist; was it deleted recently?"
Trying to list the path as a directory takes additional processing time.
From the documentation, Spark uses partition discovery by default (which means it must try to list the path as a directory), but it can also operate in a different mode – enabled by the recursiveFileLookup option.
What I want is a third mode where it just reads each of the provided paths as an exact match file. Is there a way to achieve this without a change in Spark's read method?
I have a directory containing about 100,000 multipage PDFs.
I want to separate the corrupt/unreadable and password-protected PDFs from this directory using Python.
I need a good and fast solution, as I might need to do this for a large number of files in the future.
Thanks in advance.
You can try to use PyPDF2. Loop over all files in the directory using os.listdir() and try opening each one, and store the name of each one that gives you an error. You can also place them in two different directories depending on whether opening a file gives you an error using simple try/except.
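A minimal sketch of that loop is below. It assumes PyPDF2 2.0 or later (where PdfReader and is_encrypted are available); the function names check_pdf and sort_pdfs and the three status labels are made up for illustration. Any exception raised while opening a file is treated as "corrupt", which is a simplification.

```python
import os

def check_pdf(path):
    """Return 'ok', 'encrypted', or 'corrupt' for one file (assumes PyPDF2 >= 2.0)."""
    from PyPDF2 import PdfReader  # imported here so sort_pdfs works with any checker
    try:
        reader = PdfReader(path)
        if reader.is_encrypted:
            return "encrypted"
        len(reader.pages)  # forces the page tree to be parsed
        return "ok"
    except Exception:
        # Simplification: every failure to open or parse counts as corrupt.
        return "corrupt"

def sort_pdfs(src_dir, check=check_pdf):
    """Group the .pdf file names in src_dir by the status that check() reports."""
    results = {"ok": [], "encrypted": [], "corrupt": []}
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith(".pdf"):
            continue
        results[check(os.path.join(src_dir, name))].append(name)
    return results
```

The returned lists can then be passed to shutil.move() to place each group in its own directory. For a very large directory, running check_pdf in a process pool may help, though that has not been measured here.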
I use Robot Framework and PyCharm for testing. When I run a few test cases, the following files are generated in the output folder:
log.html
output.xml
report.xml
After each test, the files are overwritten.
Is it possible to have these files named after the test they belong to, so that I do not have to rename them or create separate folders for each test? For example:
log_test1.html
output_test1.xml
report_test1.xml
Is there a parameter that takes the name of the test and passes it into the name of the output file?
Please help me set this up in PyCharm.
Regards,
All Robot Framework output files can be automatically timestamped with the option --timestampoutputs:
pybot --timestampoutputs tests.html
See User Guide section "Timestamping output files"
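If the goal is file names that carry the test name rather than a timestamp, one alternative (not the timestamping approach above) is to run each test separately and set the output names explicitly with --output, --log and --report. The sketch below only builds such a command line; the helper name robot_command is made up, and it assumes the runner is available as robot (older installations use pybot, as in the command above).

```python
def robot_command(test_name, suite_path, executable="robot"):
    """Build a command that runs one test and names its outputs after it."""
    return [
        executable,
        "--test", test_name,                       # run only this test
        "--output", f"output_{test_name}.xml",
        "--log", f"log_{test_name}.html",
        "--report", f"report_{test_name}.html",
        suite_path,
    ]

# Hypothetical test name and suite path, for illustration only.
print(" ".join(robot_command("test1", "tests.html")))
```

The resulting list can be passed to subprocess.run(), or the same arguments can be entered in a PyCharm run configuration.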
Alright, I have two folders that need to be in sync, but certain files need to be ignored before the first upload.
To illustrate what I mean, let's say I have a folder called src and another folder called dest.
src contains settings.properties, some python code, and a template properties file.
dest contains the same settings.properties and the same python code, but the template properties file is populated during the sync process (done by a script that wraps the protocol).
Now, if I decide to modify the python code in dest, the python code should be updated in the src folder, but the new, populated template.properties should be ignored.
I tried using excludes and includes, but I read that you can't use both because "includes takes precedence".
I am on Windows and currently use a python script that formats the paths to the default "/cygdrive/C/", then populates the properties file, then runs rsync.
I want to change the default directory listing of pythonwebkit (the one imported from gi.repository) for an application I am working on. Is there any function or script in WebKit that does the job?
EDIT
The code for styling the default directory listing is in the file net/base/dir_header.html and ends up in chrome.pak and chrome_100_percent.pak.
The python module data_pack.py can work with these files.
If you want to filter certain file types from the list, you can probably do that in addRow()
You will have to use os.chdir() to change the current directory for the whole process. AFAIK, WebKit doesn't keep an internal environment for things like the current folder.
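As a small illustration of the process-wide effect of os.chdir(), the sketch below changes the working directory and then resolves a relative path against it. The helper name listing_from is made up, and nothing here is WebKit-specific.

```python
import os

def listing_from(directory):
    """Switch the process-wide working directory, then list it via a relative path."""
    os.chdir(directory)  # affects every thread and library in this process
    return sorted(os.listdir("."))
```

Because the change applies to the whole process, it may be worth saving os.getcwd() beforehand and restoring it afterwards.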