I'm not sure what the right terminology for this request is, but I want to create Python output that looks something like this:
The intent is to have data extracted from specific cells of a DataFrame fill in the TEXT section of the image. Obviously I don't want the data hard-coded, so that I can change it per topic. Which packages do this? I've never explored the "visual" side of Python before.
Related
I'm having trouble finding a way to fill out an Excel template using Python. I currently have a pandas DataFrame, and I use openpyxl to write the necessary data to specific rows and cells in a for loop. The issue is that in my next project several of the cells I have to write are not contiguous: instead of going A1, A2, A3, they can go A1, A5, A9. Listing the cells by hand as I did in the past would be impractical this time.
So I am looking for something that works like a VLOOKUP in Excel, where Python matches the row and column in our template and drops the information there. I know I might need to use different commands.
I added a picture below as an example. I need to drop values into the empty cells, and ideally Python would read "USA" and "Revenue" and know to put that information in cell B2. I know I might need some kind of mapping; I am just not sure how to start or whether it is even possible.
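One way to approach the label-based lookup described above is to scan the template once and build a dictionary from (row label, column header) to cell coordinate. This is only a minimal sketch: it assumes the row labels sit in column A and the column headers in row 1, and the helper name `build_cell_index`, the sheet layout, and the sample values are all made up for illustration.

```python
from openpyxl import Workbook


def build_cell_index(ws, header_row=1, label_col=1):
    """Map (row label, column header) -> coordinate such as 'B2'."""
    # Read the header row once: header text -> column number.
    header_cells = next(ws.iter_rows(min_row=header_row, max_row=header_row))
    headers = {
        cell.value: cell.column
        for cell in header_cells
        if cell.value is not None and cell.column != label_col
    }
    index = {}
    # Walk the label column below the header row.
    for (label_cell,) in ws.iter_rows(
        min_row=header_row + 1, min_col=label_col, max_col=label_col
    ):
        if label_cell.value is None:
            continue
        for header, col in headers.items():
            target = ws.cell(row=label_cell.row, column=col)
            index[(label_cell.value, header)] = target.coordinate
    return index


# Stand-in for the real template (normally loaded with openpyxl.load_workbook).
wb = Workbook()
ws = wb.active
ws["B1"], ws["C1"] = "Revenue", "Cost"
ws["A2"], ws["A3"] = "USA", "Canada"

index = build_cell_index(ws)

# Hypothetical values; in practice these would come from the DataFrame.
values = {("USA", "Revenue"): 1200, ("Canada", "Cost"): 300}
for key, value in values.items():
    ws[index[key]] = value
```

Because the coordinates are looked up from the labels, the target cells no longer need to be contiguous or listed by hand.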
I have a CSV file, opened in Excel, that looks like this (sorry, I can't place pictures in the post yet):
RAW DATA
Here is what i want to do:
1) I want Python to read through column B and find the phrase RCOM (highlighted).
2) Once it finds that phrase, I want it to show me the date entry and the corresponding amounts, which I have made bold and red.
3) Hopefully the result would read something like this:
30-08-2018 273585.8
27-09-2018 275701.4
25-10-2018 276780
*If possible, putting the entries on separate lines would be great, but if not, that's fine too.
4) I will then store these in a variable of my choice and print it out as needed.
I know the column where the word RCOM is located and the column where the amounts I want are located (B and K, respectively).
I am very new to coding, so any help will be appreciated. I'm just trying to automate the boring stuff :)
Thanks
You can generate a DataFrame using the read_csv function from the pandas library. Once you have the data in DataFrame format, you can get to the data mentioned in your question by filtering it according to your requirements. I know this answer is very generic and does not provide a code suggestion, but I believe all the information you need can be found on the following page: https://pandas.pydata.org/pandas-docs/stable/10min.html
For importing the data, the "Getting Data In/Out" section will be helpful, and for filtering (masking) the data, the "Selection" section will help.
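As a rough illustration of the read_csv-then-filter idea, here is a minimal sketch. The sample rows are invented, the file is assumed to have no header row, and the date is assumed to be in column A; the question only states that the text is in column B and the amount in column K.

```python
import io

import pandas as pd

# Made-up sample standing in for the real file: 11 columns (A..K), no header.
sample_csv = io.StringIO(
    "30-08-2018,PAYMENT RCOM LTD,,,,,,,,,273585.8\n"
    "31-08-2018,OTHER VENDOR,,,,,,,,,100.0\n"
    "27-09-2018,PAYMENT RCOM LTD,,,,,,,,,275701.4\n"
    "25-10-2018,PAYMENT RCOM LTD,,,,,,,,,276780\n"
)

# Read everything as text so dates and amounts keep their original formatting.
df = pd.read_csv(sample_csv, header=None, dtype=str)

# Zero-based positions: A -> 0 (assumed date column), B -> 1, K -> 10.
DATE_COL, TEXT_COL, AMOUNT_COL = 0, 1, 10

# Boolean mask: True for rows whose column B contains "RCOM".
mask = df[TEXT_COL].str.contains("RCOM", na=False)
matches = df.loc[mask, [DATE_COL, AMOUNT_COL]]

# One "date amount" string per matching row, stored for later use.
lines = [
    f"{date} {amount}"
    for date, amount in matches.itertuples(index=False, name=None)
]
report = "\n".join(lines)
print(report)
```

With a real file, `sample_csv` would be replaced by the path to the CSV.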
I have a lot of PDF, DOC[X], TIFF and other files (scans from a shared folder). Each file is converted into a pack of text files: one text file per page.
Each pack of files can contain multiple documents (for example, three contracts). The documents are not necessarily contracts.
While processing a pack of files, I don't know what kinds of documents it contains, and one pack may contain multiple document kinds (contracts, invoices, etc.).
I'm looking for possible approaches to solve this programmatically.
I tried searching for something like this, but without any success.
UPD: I tried to create a binary classifier with scikit-learn and am now looking for another solution.
Since these are "scans", this sounds at its core like something that could be approached with computer vision; however, that is currently far above my level of programming.
Projects like SimpleCV may be a good starting point:
http://www.simplecv.org/
Or possibly you could get away with running OCR on the "scans" and working from the contents. pytesseract seems popular for this type of task:
https://pypi.org/project/pytesseract/
However, that still leaves open how you would tell your program that part of an image means there are 3 separate contracts. Is there anything about these files in particular that makes this clear, e.g. "1 of 3" on the pages, a logo, or something else? That will be the main factor in how complex a problem you are trying to solve.
The best solution was to create a binary classifier (SGDClassifier) and train it on the classes first-page and not-first-page. Each item in the dataset was trimmed to 100 tokens (words).
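Once every page has a first-page / not-first-page prediction, grouping the pages into documents is mechanical. The sketch below shows only that grouping step plus the 100-token trimming; `is_first_page` is a hypothetical stand-in for the trained classifier's prediction, and the toy rule and sample pages are invented for illustration.

```python
def trim_tokens(text, limit=100):
    """Keep only the first `limit` whitespace-separated tokens."""
    return " ".join(text.split()[:limit])


def split_into_documents(pages, is_first_page):
    """Group consecutive pages; a predicted first page starts a new document."""
    documents = []
    for page in pages:
        # The very first page always opens a document, whatever the prediction.
        if not documents or is_first_page(trim_tokens(page)):
            documents.append([page])
        else:
            documents[-1].append(page)
    return documents


# Toy stand-in for the classifier: real code would call the trained model here.
def toy_is_first_page(text):
    return text.startswith(("CONTRACT", "INVOICE"))


pages = [
    "CONTRACT No. 1 between A and B",
    "terms continued",
    "INVOICE 42",
    "CONTRACT No. 2",
    "signatures",
]
documents = split_into_documents(pages, toy_is_first_page)
```

Here `documents` holds one list of pages per detected document, in the original page order.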
I am trying to gather graphics and text from different folders and present them in a comprehensive way. For this I use Python to copy them into one folder and generate a dynamic LaTeX presentation, in which I plot the copied graphics and print the text. The problem I'm facing now is that I can derive the title for a slide dynamically from a text file, but if it's too long it will obviously wrap around. This dynamic title can be pretty long, so it might fill the whole slide. What I'd like to do is limit the space used by this text without losing its information. The inelegant solution I have is to count the characters and, if the count is over a certain threshold, use a smaller font. This is tedious and not optimal, so I'd love to hear a better idea.
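For reference, the character-count workaround described above could look roughly like this on the Python side. The thresholds are arbitrary placeholders, the function names are made up, and `\frametitle` assumes the presentation is built with beamer, which the question does not state.

```python
# Hypothetical thresholds (in characters) paired with standard LaTeX size commands.
SIZE_STEPS = [
    (40, r"\normalsize"),
    (80, r"\small"),
    (120, r"\footnotesize"),
    (160, r"\scriptsize"),
]
SMALLEST = r"\tiny"


def size_for_title(title):
    """Return the first size command whose threshold the title fits under."""
    for max_chars, command in SIZE_STEPS:
        if len(title) <= max_chars:
            return command
    return SMALLEST


def frame_title(title):
    """Build a beamer \\frametitle line (assumption: beamer is in use)."""
    # Note: LaTeX special characters in the title are not escaped here.
    return rf"\frametitle{{{size_for_title(title)} {title}}}"


print(frame_title("Results"))
```

Character counts are only a proxy for rendered width, which is one reason this approach needs manual tuning.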
I want to create a table with Python that looks like a simple Excel table, and for that I have already used pyExcelerator. But now I am thinking about just using pyplot.table, which seems to be very easy. However, I need to make some changes, and I don't know if they are possible in pyplot.table.
For example, I want to add a cell in the upper left corner, and I also want to make two cells beneath the cell t+1 (see the table example below).
So, is it possible to make these changes in pyplot.table, or should I use another way to make tables?
Building a program to generate a table as an image for inclusion in your Word document is a bit overkill. It's a lot of added work and completely unnecessary effort. Make the table in Excel and then paste it into Word. It will look good and will be easier to update and change.
If you are using this as an excuse to learn something new, that is all well and good, but you need to give us more to help you with. SO isn't a code factory. Show what you have tried, along with samples of the code you are having trouble with. We can help with that.