I have some information stored in MySQL and I want to fetch certain data. I have a column, File_names, whose values correspond to all kinds of files: images, scripts, text files, and so on. For example, there are many script names from different programming languages: cpp, PHP, sh, py, ...
I only need the script-related data. In my head it's clear: fetch data only if the file name corresponds to a script (which we know from its extension). But I don't know how to translate this idea into a MySQL query. I'm also thinking about fetching everything from MySQL and then filtering it with Python, but I have no clear idea there either.
I have a Python solution in mind, but I think it's too complex: create a list with a lot of script extensions, fetch everything from MySQL, split each file_name to obtain its extension, and finally compare it against the extension list. I suspect there's an easier and more efficient way to do this.
Any idea?
Thanks and best regards
A simple query:
SELECT File_names
FROM Your_table
WHERE (File_names like '%.php'
OR File_names like '%.cpp'
OR File_names like '%.sh')
Add more OR options according to your needs.
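If you'd rather drive the same filter from Python, here is a minimal sketch that builds the OR list from a list of extensions. It assumes mysql-connector-python and reuses the table/column names from the query above; the connection details are placeholders.
import mysql.connector

# Hypothetical list of script extensions; extend as needed.
extensions = ['php', 'cpp', 'sh', 'py']
conditions = ' OR '.join('File_names LIKE %s' for _ in extensions)
params = ['%.' + ext for ext in extensions]

conn = mysql.connector.connect(host='localhost', user='user',
                               password='secret', database='mydb')
cur = conn.cursor()
cur.execute('SELECT File_names FROM Your_table WHERE ' + conditions, params)
for (name,) in cur.fetchall():
    print(name)
conn.close()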
I have thousands of scanned field books as PDFs. Each has a unique filename. In a spreadsheet I have metadata for each, where each row has:
index number, filename, info1, info2, info3, info4, etc.
filename is the exact file name of the PDF. info1 is just an example of a metadata field, such as 'Year' or whatever. There are only about 8 fields or so, and not every PDF is relevant to all of them.
I assume there should be a reasonable way to create a database (MySQL or something else) by reading the spreadsheet, which I can just save as .csv or .txt or something. This part I am sure I can handle.
I want to be able to look up/search for a PDF by entering various search terms based on the metadata and get a list of results, in a web interface or a custom window, and be able to click on a result to open the file. Basically a typical search window with predefined fields you can fill in to get results, like at an old-school library terminal.
I have decent coding skills in Python (mostly math, but some file handling as well) and am looking for guidance on the tools and approach I should take. My short-term goal is to be able to query, find files, and open the results. Long term, I want to share this with the public so they can search and find things.
After trying to figure out what to search for online, I am obviously at a loss; I cannot find an example of this and am not sure how to word it. How do you suggest I do this, and what tools or libraries should I use?
The actual data stuff could be done with Pandas:
read the Excel file into Pandas
perform the search on the Pandas dataframe, e.g. using df.query()
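A minimal sketch of those two steps, assuming the spreadsheet was exported as CSV with columns like 'filename' and 'Year' (the column names are hypothetical):
import pandas as pd

df = pd.read_csv('metadata.csv')   # or pd.read_excel('metadata.xlsx')
hits = df.query('Year == 1950')    # filter on any metadata field
print(hits['filename'].tolist())   # matching PDF file names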
But this does not give you a GUI. For that you could go for a web app, using the Flask or Django framework. That, however, is not something one masters overnight :)
This is a good course to learn that kind of stuff: https://www.edx.org/course/cs50s-web-programming-with-python-and-javascript
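For a taste of what the Flask route might look like, here is a hedged sketch of a single search page over the same dataframe; the field name 'Year' and the file layout are assumptions, not a finished design:
from flask import Flask, request, render_template_string
import pandas as pd

app = Flask(__name__)
df = pd.read_csv('metadata.csv')   # hypothetical metadata export

PAGE = '''
<form>Year: <input name="year"> <input type="submit" value="Search"></form>
<ul>{% for f in files %}<li>{{ f }}</li>{% endfor %}</ul>
'''

@app.route('/')
def search():
    results = df
    year = request.args.get('year')
    if year:                       # filter only when a year was entered
        results = results[results['Year'] == int(year)]
    return render_template_string(PAGE, files=results['filename'].tolist())

if __name__ == '__main__':
    app.run(debug=True)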
I want to ask, in general, about the right path I should take to work with data downloaded from an Oracle database into Python via cx_Oracle.
Right now I have created an oracle_dataframe with content from the database; it's huge.
In my code I will use, for example, for file in files: and compare the data from each file with this oracle_dataframe, searching for what I need.
As we know, it's not efficient to hold the Oracle content in a variable and download it all over and over again.
So my thought is: maybe I should create something like a local Python-side database where I can place the Oracle content and only apply updates from Oracle?
Has someone here done something like that before? I heard that I could use MongoDB, but I want to ask for your approach.
Cheers
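One way to sketch the idea from that question: pull the Oracle table into a local SQLite file once, then query the local copy on later runs. The connection details and table name here are hypothetical.
import sqlite3
import cx_Oracle
import pandas as pd

# One-off (or scheduled) refresh from Oracle:
ora = cx_Oracle.connect('user', 'password', 'host:1521/service')
df = pd.read_sql('SELECT * FROM my_table', ora)
ora.close()

lite = sqlite3.connect('oracle_cache.db')
df.to_sql('my_table', lite, if_exists='replace', index=False)

# Later runs hit the local cache instead of re-downloading:
cached = pd.read_sql('SELECT * FROM my_table WHERE some_col = ?',
                     lite, params=('value',))
lite.close()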
I have a Microsoft Access database, and pictures that have file names matching entries in the database (though part of the file name is irrelevant). I need to read the file names and insert links to them into the database attached to their correct entries. I've been playing around with PowerShell and Python for a while, but have little experience with Access and can't find a lot of documentation on the subject. So my questions are:
Am I better off using PowerShell, Python, or something else for this project? I just happen to have experience with those two languages so they're my preferred jumping off point.
I imagine this will take a good amount of work and I don't mind getting my hands dirty/doing a lot of research, but after looking around I can't seem to find a good place to get started. Are there any specific commands, documentation, functions, etc. that could give me a jumpstart on this project?
Thanks!
EDIT: Thanks to @ako for bringing up a good point and something I was concerned about. Putting the photos in the DB itself is likely a bad idea, so I'd instead like to host them elsewhere and have links to the files generated automatically in the DB, based on the file names and the matching DB entries.
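A hedged sketch of that link-generation step with pyodbc, assuming an .accdb file, a table Entries with columns Name and PhotoLink, and photo file names that end with the matching entry name (all of these names are hypothetical):
import os
import pyodbc

conn = pyodbc.connect(
    r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
    r'DBQ=C:\data\mydb.accdb')
cur = conn.cursor()

photo_dir = r'C:\photos'
for fname in os.listdir(photo_dir):
    stem = os.path.splitext(fname)[0]
    key = stem.split('_')[-1]   # strip the irrelevant part of the name
    cur.execute('UPDATE Entries SET PhotoLink = ? WHERE Name = ?',
                os.path.join(photo_dir, fname), key)

conn.commit()
conn.close()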
I am very new to Python and want to make a script for a spreadsheet I use at work. Basically, I need to associate an address with multiple 5-digit reference codes. There are multiple addresses, each with a corresponding group of reference codes.
e.g.:
Address:
1234 E. 32nd Street,
New York, NY, 10001
Ref #'s
RL081
RL089
LA063
Address 2:
etc....
I need my script to look up a location by ref code. This information is then used to build a new spreadsheet (each row needs an address, and the address is looked up using a ref code). What is the best way to use this info in Python? Would a dictionary work? Should I put the addresses and ref codes into an XML-type file?
Thanks
Edit (clarification):
Basically, I have those addresses and corresponding ref codes (they could be in a plain text document, organized in a spreadsheet, or whatever format Python can use). The script I'm building needs to use those ref codes to enter an address into a new spreadsheet. Basically, I input a half-complete spreadsheet and the script fills in the addresses based on the ref code in each row.
Import into what?
If you have everything in a spreadsheet, Python has a very good CSV reader library. Once you've read it in, the challenge becomes what to do with it.
If you are looking at a medium-term solution, I'd recommend looking at using SQLite to set up a simple database that can manage the information in a more structured way. SQLite scales well in the beginning stages of a project, and it becomes trivial to move to a fully-fledged RDBMS like PostgreSQL or MySQL if that becomes necessary.
From there it becomes a case of writing the libraries you need to manipulate and present your data. In the initial stages this can be done from the command line, but by using an SQL database the data can later be exposed through a web page to multiple people without worrying about managing data integrity.
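A minimal sketch of that SQLite route, assuming each ref code maps to one address (the table and column names are illustrative):
import sqlite3

conn = sqlite3.connect('addresses.db')
conn.execute('CREATE TABLE IF NOT EXISTS locations '
             '(ref_code TEXT PRIMARY KEY, address TEXT)')
conn.execute('INSERT OR REPLACE INTO locations VALUES (?, ?)',
             ('RL081', '1234 E. 32nd Street, New York, NY, 10001'))
conn.commit()

row = conn.execute('SELECT address FROM locations WHERE ref_code = ?',
                   ('RL081',)).fetchone()
print(row[0])
conn.close()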
I prefer to use JSON over XML for storing data that will later be used in Python. The json module is fairly robust and easy to use. Since you will be performing lookups, I would definitely load the information as a Python dictionary. Since you'll be querying by ref codes, you'll want to use those as the keys and the addresses as the values.
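For example, a small sketch of that round trip (the sample data is taken from the question):
import json

lookup = {
    'RL081': '1234 E. 32nd Street, New York, NY, 10001',
    'RL089': '1234 E. 32nd Street, New York, NY, 10001',
}
with open('addresses.json', 'w') as f:
    json.dump(lookup, f, indent=2)

with open('addresses.json') as f:   # later, in the script
    lookup = json.load(f)
print(lookup['RL081'])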
I need my script to look up a location by ref code
Since this is the only requirement you've stated, I would recommend using a dict where keys are ref codes and values are addresses.
I'm not sure why you are asking about "file types". It seems you already have all this information stored in a spreadsheet - no need to write a new file.
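For the spreadsheet-filling step itself, here is a minimal csv-module sketch; the column names ref_code and address are hypothetical and should match your actual sheet:
import csv

lookup = {'RL081': '1234 E. 32nd Street, New York, NY, 10001'}  # ref -> address

with open('input.csv', newline='') as src, \
     open('output.csv', 'w', newline='') as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row['address'] = lookup.get(row['ref_code'], '')  # fill the blank
        writer.writerow(row)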
How can I convert a .csv file into a .dbf file using a Python script? I found this piece of code online, but I'm not certain how reliable it is. Are there any modules out there that have this functionality?
Using the dbf package you can convert a basic csv file with code similar to this:
import dbf
some_table = dbf.from_csv(csvfile='/path/to/file.csv', to_disk=True)
This will create a table with the same name, with either Character or Memo fields, and with field names of f0, f1, f2, etc.
For a different filename use the filename parameter, and if you know your field names you can also use the field_names parameter.
some_table = dbf.from_csv(csvfile='data.csv', filename='mytable',
field_names='name age birth'.split())
Rather basic documentation is available here.
Disclosure: I am the author of this package.
You won't find anything on the net that reads a CSV file and writes a DBF file such that you can just invoke it and supply 2 file-paths. For each DBF field you need to specify the type, size, and (if relevant) number of decimal places.
Some questions:
What software is going to consume the output DBF file?
There is no such thing as "the" (one and only) DBF file format. Do you need dBase III? dBase IV? 7? Visual FoxPro? etc.?
What is the maximum length of text field that you need to write? Do you have non-ASCII text?
Which version of Python?
If your requirements are minimal (dBase III format, no non-ASCII text, text <= 254 bytes long, Python 2.X), then the cookbook recipe that you quoted should do the job.
Use the csv library to read your data from the csv file. The third-party dbf library can write a dbf file for you.
Edit: Originally, I listed dbfpy, but the library above seems to be more actively updated.
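A hedged sketch combining the two, assuming the third-party dbf package (pip install dbf) and a two-column CSV; the field specs are illustrative, and the mode argument to open() may vary with the dbf version:
import csv
import dbf

# Hypothetical field layout; adjust types and sizes to match your data.
table = dbf.Table('output.dbf', 'name C(50); age N(3,0)')
table.open(mode=dbf.READ_WRITE)

with open('input.csv', newline='') as f:
    for name, age in csv.reader(f):
        table.append((name, int(age)))

table.close()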
None that are well-polished, to my knowledge. I have had to work with xBase files many times over the years, and I keep finding myself writing code to do it when I have to do it. I have, somewhere in one of my backups, a pretty functional, pure-Python library to do it, but I don't know precisely where that is.
Fortunately, the xBase file format isn't all that complex. You can find the specification on the Internet, of course. At a glance the module that you linked to looks fine, but of course make copies of any data that you are working with before using it.
A solid, read/write, fully functional xBase library with all the bells and whistles is something that has been on my TODO list for a while... I might even get to it in what is left of this year, if I'm lucky... (probably not, though, sadly).
I have created a Python script, linked below. It should be customizable for any csv layout. You do need to know your DBF data structure before this will be possible. The script requires two csv files, one for your DBF header setup and one for your body data. Good luck.
https://github.com/mikebrennan/csv2dbf_python