How to write unit tests for text parser? [closed] - python

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
For background, I am a mostly self-taught Python developer with only some formal training from a few CS courses in school.
In my job right now, I am working on a Python program that automatically parses information from a very large text file (thousands of lines) that is the output of a simulation program. I would like to do test-driven development (TDD), but I am having a hard time understanding how to write proper unit tests.
My trouble is that the outputs of some of my functions (units) are massive data structures: parsed versions of the text file. I could create those expected outputs by hand and then test against them, but that would take a lot of time. The whole point of a parser is to save time and create structured output. The only testing I have done so far is manual trial and error, which is also cumbersome.
So my question is, are there more intuitive ways to create tests for parsers?
Thank you in advance for any help!

Parsers are usually tested with a regression testing system. You create sample input sets and verify once that the output for each is correct. Then you store each input together with its verified output in a test library. Each time you modify the code, you run the regression tests over the library to see whether any output changes.
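A minimal sketch of that workflow, assuming the verified output is stored as JSON next to each sample input. The parser `parse_report`, the helper `check_against_golden`, and the file names are hypothetical stand-ins, not part of the original question:

```python
import json
import tempfile
from pathlib import Path


def parse_report(text):
    """Hypothetical parser: turns 'key = value' lines into a dict of floats."""
    result = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            result[key.strip()] = float(value)
    return result


def check_against_golden(sample, golden, update=False):
    """Parse `sample` and compare the result with the stored output in `golden`.

    With update=True the current output is written to `golden` instead, so it
    can be reviewed by hand once and then kept under version control.
    """
    actual = parse_report(Path(sample).read_text())
    golden = Path(golden)
    if update:
        golden.write_text(json.dumps(actual, indent=2, sort_keys=True))
        return True
    return actual == json.loads(golden.read_text())


# Demo in a temporary directory; a real project would keep these files in the repo.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp, "run1.txt")
    golden = Path(tmp, "run1.expected.json")
    sample.write_text("temperature = 300.0\npressure = 1.5\n")
    check_against_golden(sample, golden, update=True)  # record once, then review
    print(check_against_golden(sample, golden))        # later runs must still match
```

The expensive step (checking a large output by hand) happens once per sample file; after that, every code change is checked against the stored result automatically, for example from a pytest test that calls the helper for each sample.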

Related

Transfer data from c++ to python in real time [closed]

Closed 4 years ago.
I would like to ask a question about transferring data from C++ to Python in real time.
My situation is:
1) I am generating data every 1 ms in C++,
2) I would like to accumulate data for a certain amount of time and build a dataset,
3) I would like to run a machine learning algorithm written in Python without turning off the C++ program.
So far, I have thought about several options:
Option 1) Save the dataset as a txt file and read it from Python. But this seems too slow because of file I/O.
Option 2) Use IPC such as ZeroMQ. I am quite new to IPC, so I am not sure whether it is what I really want. Also, among the available methods (mmap, shared memory, message queue, ...), I do not know which one is the best fit for me.
Option 3) Use UDP. From my understanding, UDP sometimes delivers the same data twice, drops data, or delivers it out of order (e.g. data from a previous time step arrives later).
Are there any recommendations on what I should search for and study?
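To make the UDP concern in option 3 concrete, here is a rough sketch of how a Python receiver could cope with duplicated, reordered, and lost datagrams. It assumes the C++ side prefixes every sample with a sequence number; the wire format, the `PACKET` layout, and the `assemble` helper are assumptions for illustration only, and no real socket is opened:

```python
import struct

# Assumed wire format (not from the question): a little-endian uint32 sequence
# number followed by one float64 sample.
PACKET = struct.Struct("<Id")


def assemble(datagrams):
    """Return (samples in sequence order, list of missing sequence numbers)."""
    by_seq = {}
    for raw in datagrams:
        seq, value = PACKET.unpack(raw)
        by_seq.setdefault(seq, value)  # a duplicate keeps the first copy
    if not by_seq:
        return [], []
    ordered = [by_seq[s] for s in sorted(by_seq)]
    expected = range(min(by_seq), max(by_seq) + 1)
    missing = [s for s in expected if s not in by_seq]
    return ordered, missing


# Simulated arrival: packet 1 is duplicated, packet 2 is late, packet 3 is lost.
arrived = [
    PACKET.pack(seq, value)
    for seq, value in [(0, 0.0), (1, 0.1), (1, 0.1), (4, 0.4), (2, 0.2)]
]
samples, missing = assemble(arrived)
print(samples, missing)
```

A sequence number lets the receiver detect these problems, but it cannot recover a lost sample; if every sample matters, a transport with delivery guarantees (such as the IPC mechanisms in option 2) avoids this bookkeeping.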

Deploying python scripts [closed]

Closed 4 years ago.
I am a Python beginner with a little experience in OO programming in Java and PHP and in functional programming in R. My question concerns the general use of Python scripts in everyday use cases.
I want to "learn" how to think about and approach a problem when I face a situation with my software where a "script" could help me out or improve something.
For instance, I've heard friends talking about their self-made Python scripts that even out the audio of movies to avoid loud outliers in explosive scenes, etc. Another example, in my case right now, is to filter out certain pictures that have no GPS time metadata for the timezone, in order to sort these photos in accordance with the others.
I really want to get the essence and a recipe based on the aforementioned examples to better integrate Python into my everyday life and get an intuitive feeling for it (i.e. what would a simple script look like that takes a picture, extracts its metadata, and does something with it, and where do I have to run the script so I can call the function with these .JPG files as its arguments?).
I would also be glad if some of you could recommend some practical tutorials or literature.
Thank you in advance :)
P.S. I know this is not a concrete question; it is intended to get a glimpse of a wide field of usage and thinking. But I want to get the essential takeaway that motivates me and shows me the direction.

Publish Python code for data analysis and reference it in PhD Thesis [closed]

Closed 6 years ago.
I have a general question concerning "publishing" python code and referencing it later in my own PhD thesis. I hope someone can provide helpful thoughts about it.
My plan:
During my PhD I have written several code snippets for time-frequency analysis. These are not large code projects, but snippets that provide functionality not included in the general scipy.signal package. In approximately 6 months I will hand in the thesis, so I am now thinking about what to include in it. If I include these snippets, I thought it would be "cooler" to have them already "published" in some form instead of just putting the code in the appendix of the thesis. That way I could write something like this in my thesis: The code for analysing the data_x_y is also available at ...
I would like to find the easiest way to accomplish this.
Thanks for any comments!
Publish the code on GitHub. You can optionally create a Python package and publish that to PyPI.
Once it's on GitHub, you can get a free DOI for it using Zenodo. This will create a permanent record (including source code), and makes your code easily citable (both by yourself and others).

Python - First Interface with a Program [closed]

Closed 8 years ago.
I have spent the last six months learning Python as a way to automate my working environment. So far I have automated data extraction and report downloading from various web-based sources using web crawlers, worked with Excel files, created visual representations of data with matplotlib, and removed almost all the monotony from bank reconciliation.
I now come to a new task which takes up a large amount of my daily workload. We use an accounts program called Sage 50 Accounts. I effectively want to begin to learn how to manipulate the data contained within this program so that my daily thought patterns can be put into Python code.
Because this hasn't been done, there's no pre-made API. So my question is:
When wishing to interact with a new program through Python, how does a programmer begin such an inquiry?
Please accept that this question is only vague and general because I'm incredibly new to such a task.
SData is Sage's general data access API layer and should suit your purposes.
Otherwise you might need to invest in or obtain a Sage Development SDK.

parsing C code using python [closed]

Closed 5 years ago.
I have a huge C file (~100k lines) which I need to be able to parse. Mainly I need to get details about the individual fields of every structure (such as the name and type of every field) from its definition. Is there already a good way to do this (open source, so that I can use it in my code)? Or should I write my own parser for this? If I have to write my own, can anyone suggest a good place to start? I have never worked with Python before.
Thanks
Take a look at this link for an extensive list of parsing tools available for Python. Specifically, for parsing C code, try pycparser.
The right way to do this is almost certainly to interface with the front-end of an existing compiler, such as gcc, then work with the intermediate representation, rather than attempting to create your own parser, in any language.
However, pycparser, as suggested by Dhara, might well be a good substitute, and is definitely better than any attempt to roll your own.
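A minimal sketch of the pycparser route, assuming the package is installed and the source has already been run through the C preprocessor (pycparser expects preprocessed code without `#include` lines or comments). The helpers `describe`, `walk`, and `struct_fields` and the sample structs are made up for illustration:

```python
try:
    from pycparser import c_ast, c_parser  # third-party: pip install pycparser
except ImportError:  # keeps the sketch importable without the package
    c_ast = c_parser = None

# Illustrative input; a real file must be preprocessed first (e.g. with gcc -E).
SAMPLE = """
struct point { int x; int y; };
struct node { struct point pos; char *label; struct node *next; };
"""


def describe(t):
    """Render a pycparser type node as a short, C-like string."""
    if isinstance(t, c_ast.PtrDecl):
        return describe(t.type) + " *"
    if isinstance(t, c_ast.ArrayDecl):
        return describe(t.type) + " []"
    if isinstance(t, c_ast.TypeDecl):
        return describe(t.type)
    if isinstance(t, c_ast.IdentifierType):
        return " ".join(t.names)
    if isinstance(t, c_ast.Struct):
        return "struct " + str(t.name)
    return type(t).__name__  # fallback for node kinds not handled here


def walk(node):
    """Yield every node in the AST, depth first."""
    yield node
    for _, child in node.children():
        yield from walk(child)


def struct_fields(source):
    """Map each struct defined in `source` to its (field name, type) pairs."""
    ast = c_parser.CParser().parse(source)
    return {
        n.name: [(d.name, describe(d.type)) for d in n.decls]
        for n in walk(ast)
        if isinstance(n, c_ast.Struct) and n.decls  # skip bare references
    }


if c_parser is not None:
    for name, fields in struct_fields(SAMPLE).items():
        print(name, fields)
```

This only covers plain, pointer, array, and nested struct fields; unions, enums, bit-fields, and function pointers would need extra cases in `describe`.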
