I am working on an IoT project where a sensor sends data that is stored in a CSV file. A Python program then reads this file every 30 seconds, takes 300 rows, and does some processing. So at the 30-second mark rows 1-300 will be read, and at the 60-second mark rows 301-600 will be read.
At the same time, data keeps being written into this file.
So is it possible in some way to implement this in Python, where one program is writing data into a file while another is reading from it?
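Yes, this can work as long as the writer only ever appends complete lines and the reader remembers how many rows it has already consumed. A minimal sketch of the reader side, where the file name, chunk size, and process() function are assumptions rather than part of your setup:

import csv
import time

CSV_PATH = "sensor_data.csv"   # assumed name of the file the sensor writes to
CHUNK = 300                    # rows to handle per pass
INTERVAL = 30                  # seconds between passes

def process(batch):
    # placeholder for the real processing
    print("processing %d rows" % len(batch))

rows_done = 0
while True:
    time.sleep(INTERVAL)
    with open(CSV_PATH, newline="") as f:
        reader = csv.reader(f)
        for _ in range(rows_done):      # skip rows handled in earlier passes
            next(reader, None)
        batch = []
        for row in reader:
            batch.append(row)
            if len(batch) == CHUNK:
                break
    if batch:
        process(batch)
        rows_done += len(batch)

Because the file is reopened on every pass, rows appended by the other program in the meantime are picked up automatically; the only requirement on the writer is that it opens the CSV in append mode and writes whole lines at a time.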
I am building a bot with Python, and this bot saves data into a CSV file, so every time it runs, Python opens the file, edits it, and saves it.
The thing is, I am connected to that CSV from VBA (through SQL), and every X seconds the VBA reads the CSV and calculates some things from it.
The problem comes at the moment when Python is writing to that CSV while the VBA wants to read from it at the same time, or vice versa: if the VBA is reading from the CSV, Python crashes with a permission error.
What is the solution for this kind of situation, where more than one program needs to access the same CSV file at the same time?
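There is no single built-in fix, but a common pattern on the Python side is to never modify the shared CSV in place: write a temporary file and atomically swap it in, and retry briefly if the other program still has the file open. A rough sketch under those assumptions (the file name, row format, and retry timings are made up for illustration):

import os
import tempfile
import time

def write_csv_atomically(rows, path="shared_data.csv", retries=10):
    # write everything to a temp file in the same directory first
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as f:
        for row in rows:
            f.write(";".join(str(v) for v in row) + "\n")
    # swap the temp file in; on Windows this can fail while the VBA side
    # still holds the file, so retry a few times instead of crashing
    for attempt in range(retries):
        try:
            os.replace(tmp_path, path)
            return
        except PermissionError:
            time.sleep(0.2)
    raise OSError("could not replace %s, file stayed locked" % path)

The VBA side still needs its own short retry when a read fails, but with atomic replacement it will always see either the old file or the complete new one, never a half-written CSV.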
I am trying to gather data from my load cell that is hooked up to a Raspberry Pi 3B. I have the script reading data, but I am trying to figure out how to export that data into a .txt file.
I have seen some info on how to create .txt files with text, but not much for integers. I need something better suited to my application.
The script takes x (not sure of the exact value) samples of data per second, so the amount of data can vary depending on how long I run the script. I want a text file to be created once I stop the script, with the data points recorded on separate lines like in the attached image. I found a way to do it with words/letters, but integers wouldn't work.
Let me know if there is anything I can share to help find a solution more easily. I appreciate all the input.
In Python, you can use "open" with the "w" mode to create a new file:
file = open("load_cell_data.txt", "w")
Pass the data in with file.write() and then close the file with file.close().
Docs: https://docs.python.org/3/library/functions.html#open
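The integer part is just a matter of converting each value to a string before writing, since write() only accepts text. A small sketch, assuming the readings end up in a list (the name samples and the values are made up):

samples = [512, 498, 503]              # placeholder load cell readings

with open("load_cell_data.txt", "w") as file:
    for value in samples:
        file.write(str(value) + "\n")  # one reading per line

Using a with block also closes the file for you when the script stops, so no explicit file.close() is needed.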
So, I'm extracting lots of data from an OpenVMS RDB Oracle database with a Python script.
From some "hand-made" profiling, 3/4 of the time is spent writing the data out to the TXT file.
print >> outputFile, T00_EXTPLANTRA_STRUCTS.setStruct(parFormat).build(hyperContainer)
This is the specific line that prints out the data, which takes 3/4 of the execution time.
T00_EXTPLANTRA_STRUCTS.py is an external file containing the data structures (which .setStruct() defines), and hyperContainer is a Container from the "Construct" library that holds the data.
It takes almost five minutes to extract the whole file. I'd really like to know if there is a way to write the TXT data faster than this.
I have already optimized the rest of the code, especially the DB transactions; it's just the writing operation that takes a long time to execute.
The data to write looks like this, with about 167,000 lines of this kind (I hid the actual data with "X"):
XX;XXXXX;X;>;XXXXX;XXXX;XXXXXXXXXXXXXX ;XXX; ;XXX; ;
Lots of thanks to anyone spending any time on this.
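One thing worth trying is to buffer the formatted records in memory and write them out in large chunks instead of printing one record at a time; with roughly 167,000 small lines the per-call overhead adds up. A hedged sketch of that idea, where generate_records() and format_record() are placeholders for your extraction loop and your setStruct(...).build(...) call:

FLUSH_EVERY = 10000   # how many formatted lines to accumulate before writing

def generate_records():
    # placeholder for the rows coming back from the DB extraction
    for i in range(167000):
        yield i

def format_record(record):
    # placeholder for T00_EXTPLANTRA_STRUCTS.setStruct(parFormat).build(record)
    return "XX;%d;X" % record

lines = []
with open("output.txt", "w") as outputFile:   # assumed output path
    for record in generate_records():
        lines.append(format_record(record) + "\n")
        if len(lines) >= FLUSH_EVERY:
            outputFile.writelines(lines)      # one big write instead of many prints
            lines = []
    if lines:
        outputFile.writelines(lines)

If setStruct(parFormat) returns the same struct for every record, building it once outside the loop instead of on every line may also save a noticeable amount of time.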
Apologies if this is too stupid a question:
I run the weather data logging program Cumulus on my PC. It writes each data sample to the realtime.txt file every second (the previous sample is always overwritten, i.e. the file size stays at one data row).
I want to publish a pseudo-realtime chart of the last hour of wind data (strength and direction) to the web. My plan is to have a Python program read the realtime.txt data row each second and append it to a buffer which is saved to a file. A Google Apps Script is then time-triggered every minute, reads the buffer file, and appends the last minute's worth of data to a circular one-hour data table in Google Sheets. The table's data is used in a line chart, which is then published and can be read by anyone.
So basically I have a short term (1-minute) FIFO which feeds a longer term (1-hour) FIFO using a buffer file for transfer.
Now my problem is how to sync these FIFOs so that every one-second data row is counted and no row is counted twice. As the two apps are unsynchronized, this is not trivial, at least not for me.
I'm thinking of giving each data row a unique row number, enabling the Apps Script to figure out where to put the next chunk of data in the longer-term buffer. I would also expand the short-term FIFO to 2 minutes, which would ensure that there is always at least one minute's worth of fresh data available regardless of any jitter in the trigger time of the Apps Script.
But converting these thoughts into code is proving a bit overwhelming for me ;( Has anyone had the same problem, or can anyone otherwise advise, even slightly?
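The row-numbering idea can stay quite small on the Python side: tag every sample with a monotonically increasing sequence number as it goes into the buffer file, and let the Apps Script remember the highest number it has already consumed. A minimal sketch of the collector loop, where the file names and the duplicate check are assumptions about the setup:

import time

REALTIME = "realtime.txt"   # Cumulus output: one row, overwritten every second
BUFFER = "buffer.csv"       # rolling buffer that the Apps Script reads

seq = 0
last_row = None
while True:
    try:
        with open(REALTIME) as f:
            row = f.read().strip()
    except OSError:
        row = ""                      # Cumulus may be mid-write; try again next second
    if row and row != last_row:       # only count genuinely new samples
        seq += 1
        with open(BUFFER, "a") as out:
            out.write("%d;%s\n" % (seq, row))
        last_row = row
    time.sleep(1)

On the Sheets side the script then only has to remember the largest sequence number it has already appended and take the buffered rows with higher numbers, so no row can be counted twice even if the trigger time jitters; trimming old rows out of the buffer file can be done in the same loop.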
I have a program that I wrote for single-computer execution (in Python) and have also implemented the same thing for Spark. This program basically just reads a .json file, takes one field from it, and saves it back. Using Spark, my program runs approximately 100 times slower on 1 master and 1 slave than the standard single-node Python program (which of course also reads from a file and saves to a file). So I would like to ask where the problem might be.
My Spark program looks like:
from pyspark import SparkContext

sc = SparkContext(appName="Json data preprocessor")
distData = sc.textFile(sys.argv[2])
json_extractor = JsonExtractor(sys.argv[1])
cleanedData = distData.flatMap(json_extractor.extract_json)
cleanedData.saveAsTextFile(sys.argv[3])
JsonExtractor only selects the data from the field that is given by sys.argv[1].
My data are basically many small one-line files, where the line is always JSON.
I have tried both reading and writing the data from/to Amazon S3 and the local disk on all the machines.
I would like to ask if there is something I might be missing, or if Spark is supposed to be this slow in comparison with the local, non-parallel, single-node program.
As was advised to me on the Spark mailing list, the problem was the large number of very small JSON files: each small file ends up as its own partition and task, so the per-task overhead dominates the actual work.
Performance can be much improved either by merging the small files into one bigger file beforehand or by coalescing the partitions:
sc = SparkContext(appName="Json data preprocessor")
distData = sc.textFile(sys.argv[2]).coalesce(10)  # coalesce the input into 10 partitions
json_extractor = JsonExtractor(sys.argv[1])
cleanedData = distData.flatMap(json_extractor.extract_json)
cleanedData.saveAsTextFile(sys.argv[3])