I would like to create a structure in Python which represents a Simulink model. I am aware of at least two ways of doing this: by parsing an ".mdl" file, or by using MATLAB's API for communicating with the model.
Can you recommend good libraries or APIs for doing this?
In particular, I need to perform some processing on a Simulink model, and I would like to do it in Python. I also don't want to be constantly communicating with MATLAB while doing this (so that I can release the floating license).
I have seen some parsers online, but they seem to be a little limited, usually not supporting components such as Bus Creators and Bus Selectors, Muxes, Demuxes, and reading UserData information.
Any help will be greatly appreciated.
Not my area, but noticed this Python parser which may be helpful.
Or you can purchase the Simulink Report Generator in order to save/manipulate the model as an XML file.
Or, the *.mdl file is readable ASCII. You could read it into a string (e.g. with a single fread-style call), alter the string, then either save it in your format of choice or write it back out as a *.mdl file. My coworker thought of this, not me! But it would require doing the editing/parsing with a routine you write yourself.
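For example, a minimal sketch of that plain-text approach in Python might look like this (the file name and the block-rename edit are placeholders, not anything from your model):

```python
# A minimal sketch of the "treat the *.mdl as plain text" idea.
# The file name and the rename below are placeholders.
from pathlib import Path

mdl_path = Path("my_model.mdl")              # hypothetical model file
text = mdl_path.read_text()

# Example edit: rename a block wherever its quoted name appears.
text = text.replace('"OldGain"', '"NewGain"')

# Write the modified model to a new file (keep the original as a backup).
Path("my_model_edited.mdl").write_text(text)
```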
At work, we already have a custom script to perform regressions for us.
I have been tasked with creating a wrapper script which will ultimately invoke our pre-existing script in a custom fashion and do a bunch of housekeeping in and around the executed tests.
Essentially what I need to provide are:
The ability to invoke the simulation using custom switches, paths, etc.
Some form of regression tracking (a history of pass/fail)
Some form of basic web interface to view regression results
Perhaps an automated email if something breaks
Now nothing I have mentioned is new and I am sure there are a lot of ways to approach this.
What I am looking for is some suggestions as to a good way to go about this.
Some solutions that come to mind:
Build custom python script (most effort)
Extend Python's built-in unittest framework, subclassing where necessary (roughly like the sketch after this list).
This is really the crux of my question. Is this a good solution?
Use some other framework?
Jenkins? I have not used Jenkins, but I've heard good things about it. Any thoughts on whether it would suit here?
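For concreteness, here is roughly what I picture option 2 looking like (the script name, switch, and config file are placeholders, not our real ones):

```python
# Rough sketch: wrap the existing Perl regression script in unittest, so each
# test case shells out to it and asserts on the exit code.
import subprocess
import unittest

class RegressionCase(unittest.TestCase):
    SIM_SCRIPT = "./run_regression.pl"   # placeholder for the real Perl script

    def run_sim(self, *switches):
        # Invoke the simulation with custom switches and capture its output.
        return subprocess.run([self.SIM_SCRIPT, *switches],
                              capture_output=True, text=True)

    def test_baseline_config(self):
        result = self.run_sim("--config", "baseline.cfg")
        self.assertEqual(result.returncode, 0, msg=result.stderr)

if __name__ == "__main__":
    unittest.main()
```

The pass/fail history and web view could then come from whatever runs this (e.g. Jenkins archiving the results), rather than from the wrapper itself.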
Thanks for your help!
Some of you may ask why I don't extend the base simulation script directly.
Well, it's a few thousand line Perl monster!
I don't speak Perl, nor do I have any intention to start!
I am working with some network simulator. After making some extensions to it, I need to make a lot of different simulations and tests. I need to record:
simulation scenario configurations
values of some parameters (e.g. buffer sizes, signal qualities, position) per device per time unit t
final results computed from those recorded values
The second kind of data is needed to perform some visualization after the simulation has run (a simple animation showing some statistics over time).
I am using Python with matplotlib etc. for post-processing the data and for writing a proper app (currently considering PyQt or Django, but this is not the topic of the question). Now I am wondering what the best way to store this data would be.
My first guess was to use XML files, but the XML syntax adds a lot of overhead (the files can grow very large, especially for the second kind of data). So I tried to design a database... but that also does not seem to be the proper way... Maybe a mix of both?
I have tried to find some clues on Google, but found nothing special. Have you ever needed to store such data? How did you do it? Is there any "design pattern" for that?
Separate concerns:
Apart from pondering on the technology to use for storing data (DBMS, CSV, or maybe one of the specific formats for scientific data), note that you have three very different kinds of data to manage:
Simulation scenario configurations: these are (typically) rather small, but they need to be simple to edit, simple to re-use, and should make it possible to reproduce a simulation run. Here, text or code files seem to be a good choice (these should also be version-controlled).
Raw simulation data: this is where you should be really careful if you are concerned with simulation performance, because writing 3 GB of data during a run can take a huge amount of time if implemented badly. One way to proceed would be to use existing file formats for this purpose (see below, and the short sketch after this list) and see if they work for you. If not, you can still use a DBMS. Also, it is usually a good idea to include a description of the scenario that generated the data (or at least a reference to it), as this helps you manage the results.
Data for post-processing: how to store this mostly depends on the post-processing tools. For example, if you already have a class structure for your visualization application, you could define a file format that makes it easy to read in the required data.
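To make the second point concrete, here is a rough sketch of one possible layout for the raw per-device, per-time-step values using h5py (h5py itself, the dataset names, and the sizes are my assumptions, purely for illustration):

```python
import h5py
import numpy as np

n_devices, n_steps = 100, 10_000

with h5py.File("run_001.h5", "w") as f:
    # Keep a reference to the scenario that produced this run.
    f.attrs["scenario"] = "scenario_A.ini"   # placeholder config name
    buf = f.create_dataset("buffer_size", (n_devices, n_steps),
                           dtype="f4", compression="gzip")
    sig = f.create_dataset("signal_quality", (n_devices, n_steps),
                           dtype="f4", compression="gzip")
    for t in range(n_steps):
        # Stand-ins for the values your simulator would produce at step t.
        # (In a real run you would buffer several steps and write larger chunks.)
        buf[:, t] = np.random.rand(n_devices)
        sig[:, t] = np.random.rand(n_devices)
```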
Look for existing solutions:
The problem you face (How to manage simulation data?) is fundamental and there are many potential solutions, each coming with certain trade-offs. As you are working in network simulation, check out what capabilities other tools used in your community provide. It could be that their developers ran into problems you are not even anticipating yet (regarding reproducibility etc.), and already found a good solution. For example, you could check out how OMNeT++ is handling simulation output: the simulation configurations are defined in a separate file, results are written to vec and sca files (depending on their nature). As far as I understood your problems with hierarchical data, this is supported as well (vectors get unique IDs and are associated with an attribute of some model entity).
Additional tools already work with these file formats, e.g. to convert them to other formats like CSV/MATLAB files, so you could even think of creating files in the same format (documented here) and using existing tools/converters for post-processing.
Many other simulation tools will have similar features, so take a look at what would work best for you.
It sounds like you need to record more or less the same kinds of information for each case, so a relational database sounds like a good fit -- why do you think it's "not the proper way"?
If your data fits in a collection of CSV files, you're most of the way to a relational database already! Just store in database tables instead, and you have support for foreign keys and queries. If you go on to implement an object-oriented solution, you can initialize your objects from the database.
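For instance, a minimal sqlite3 sketch of that kind of layout (the table and column names are invented for illustration):

```python
import sqlite3

con = sqlite3.connect("results.db")
con.executescript("""
CREATE TABLE IF NOT EXISTS scenario (
    id     INTEGER PRIMARY KEY,
    name   TEXT,
    config TEXT
);
CREATE TABLE IF NOT EXISTS sample (
    scenario_id INTEGER REFERENCES scenario(id),
    device_id   INTEGER,
    t           REAL,
    buffer_size INTEGER,
    signal_q    REAL
);
""")
con.execute("INSERT INTO scenario (name, config) VALUES (?, ?)",
            ("baseline", "scenario_A.ini"))
con.commit()

# The kind of grouping query you describe: mean signal quality per device.
rows = con.execute(
    "SELECT device_id, AVG(signal_q) FROM sample GROUP BY device_id"
).fetchall()
```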
If your data structures are well-known and stable, AND you need some of the SQL querying/computation features, then a lightweight relational DB like SQLite might be the way to go (just make sure it can handle your eventual 3+ GB of data).
Otherwise - i.e., if each simulation scenario might need a dedicated data structure to store the results - and you don't need any SQL features, then you might be better off using a more free-form solution (document-oriented database, OO database, filesystem + CSV, whatever).
Note that you can still use an SQL DB in the second case, but you'll have to create the tables dynamically for each result set, and of course generate the relevant SQL queries dynamically too.
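A rough sketch of that dynamic-table idea (purely illustrative; in real code you would also validate the names you interpolate):

```python
import sqlite3

def create_result_table(con, run_name, fields):
    # Build one table per result set from the fields that run actually produced.
    cols = ", ".join(f'"{name}" REAL' for name in fields)
    con.execute(f'CREATE TABLE IF NOT EXISTS "{run_name}" (t REAL, {cols})')

con = sqlite3.connect("results.db")
create_result_table(con, "run_001", ["buffer_size", "signal_quality"])
```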
I'm starting to learn about doing data analysis in Python.
In R, you can load data into memory, then save variables into a .RData file.
I'm trying to create an analysis "project", so I can load the data, store the scripts, then save the output so I can recall it should I need to.
Is there an equivalent function in Python?
Thanks
What you're looking for is binary serialization. The most notable functionality for this in Python is pickle. If you have some standard scientific data structures, you could look at HDF5 instead. JSON works for a lot of objects as well, but it is not binary serialization - it is text-based.
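For example, a rough pickle-based counterpart to R's save()/load() (the file name and objects are placeholders):

```python
import pickle

results = {"coefficients": [1.2, 3.4], "labels": ["a", "b"]}

# Save the objects you care about...
with open("analysis.pkl", "wb") as f:
    pickle.dump(results, f)

# ...and load them back in a later session.
with open("analysis.pkl", "rb") as f:
    restored = pickle.load(f)
```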
If you expand your options, there are a lot of other serialization choices too, such as Google's Protocol Buffers (the developer of RProtoBuf is the top-ranked answerer for the r tag on SO), Avro, Thrift, and more.
Although there are generic serialization options, such as pickle and .RData, careful consideration of your usage will help make I/O fast and appropriate to your needs, especially if you need random access, portability, parallel access, tool re-use, etc. For instance, I now tend to avoid .RData for large objects.
json, pickle
I'm working on an academic project aimed at studying people behavior.
The project will be divided in three parts:
A program to read the data from some remote sources, and build a local data pool with it.
A program to validate this data pool, and to keep it coherent
A web interface to allow people to read/manipulate the data.
The data consists of a list of people, all with an ID #, and with several characteristics: height, weight, age, ...
I need to easily make groups out of this data (e.g. everyone of a given age, or within a range of heights), and the data is several TB in size (but it can be reduced to smaller subsets of 2-3 GB).
I have a strong background in the theoretical stuff behind the project, but I'm not a computer scientist. I know Java, C, and MATLAB, and now I'm learning Python.
I would like to use Python since it seems easy enough and greatly reduces the verbosity of Java. The problem is that I'm wondering how to handle the data pool.
I'm no expert on databases, but I guess I need one here. What tools do you think I should use?
Remember that the aim is to implement very advanced mathematical functions on sets of data, thus we want to reduce complexity of source code. Speed is not an issue.
It sounds like the main functionality needed can be found in pytables and scipy/numpy.
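For example, a small PyTables sketch of the kind of range query you mention (the column names and the query are placeholders):

```python
import tables as tb

class Person(tb.IsDescription):
    person_id = tb.Int64Col()
    height_cm = tb.Float32Col()
    weight_kg = tb.Float32Col()
    age       = tb.Int32Col()

with tb.open_file("people.h5", mode="w") as h5:
    table = h5.create_table("/", "people", Person, "subject characteristics")

    # Append one illustrative record.
    row = table.row
    row["person_id"] = 1
    row["height_cm"] = 180.0
    row["weight_kg"] = 75.0
    row["age"] = 34
    row.append()
    table.flush()

    # In-kernel query: everyone between 30 and 40 years old.
    ids = [r["person_id"] for r in table.where("(age >= 30) & (age <= 40)")]
```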
Go with a NoSQL database like MongoDB, which makes handling data in such a case much easier than having to learn SQL.
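For instance, a small pymongo sketch of such a range query (the connection string and field names are placeholders):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
people = client["study"]["people"]

# One illustrative subject record.
people.insert_one({"person_id": 1, "height": 178, "weight": 74, "age": 31})

# Everyone between 170 cm and 185 cm tall:
group = list(people.find({"height": {"$gte": 170, "$lte": 185}}))
```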
Since you aren't an expert, I recommend using a MySQL database as the backend for storing your data. It's easy to learn, and you'll be able to query your data using SQL and write your data using Python; see this MySQL guide: Python-Mysql.
I was just looking through some information about Google's protocol buffers data interchange format. Has anyone played around with the code or even created a project around it?
I'm currently using XML in a Python project for structured content created by hand in a text editor, and I was wondering what the general opinion was on Protocol Buffers as a user-facing input format. The speed and brevity benefits definitely seem to be there, but there are so many factors when it comes to actually generating and processing the data.
If you are looking for user-facing interaction, stick with XML. It has more support, understanding, and general acceptance currently. If it's internal, I would say that Protocol Buffers are a great idea.
Maybe in a few years, as more tools come out to support Protocol Buffers, you can start looking towards them for a public-facing API. Until then... JSON?
Protocol buffers are intended to optimize communications between machines. They are really not intended for human interaction. Also, the format is binary, so it could not replace XML in that use case.
I would also recommend JSON as being the most compact text-based format.
Another drawback of a binary format like PB is that a single bit error can make the entire data file unparsable, whereas with JSON or XML you can, as a last resort, still fix the error manually because the format is human-readable and has some redundancy built in.
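To illustrate the machine-to-machine, binary point, here is a rough Python sketch; it assumes a hypothetical person.proto with name and id fields, compiled with protoc into a person_pb2 module:

```python
# person_pb2 is the module protoc would generate from the assumed person.proto.
import person_pb2

p = person_pb2.Person()
p.name = "Ada"
p.id = 42

blob = p.SerializeToString()   # compact binary bytes, not human-readable

q = person_pb2.Person()
q.ParseFromString(blob)        # only parses if every byte is intact
print(q.name, q.id)
```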
From your brief description, it sounds like protocol buffers is not the right fit. The phrase "structured content created by hand in a text editor" pretty much screams for XML.
But if you want efficient, low latency communications with data structures that are not shared outside your organization, binary serialization such as protocol buffers can offer a huge win.