Disable Cache/Buffer on Specific File (Linux) - python

I am currently working in a Yocto Linux build and am trying to interface with a hardware block on an FPGA. This block imitates an SD card with a FAT16 file system containing a single file (cam.raw). This file represents the shared memory space between the FPGA and the Linux system. As such, I want to be able to write data from the Linux system to this memory and get back any changes the FPGA might make. (Currently, the FPGA simply takes part of the data from the memory space and adds 6 to the LSB of a 32-bit word; e.g., I write 0x40302010 and should get back 0x40302016 when I read the data back.) However, due to caching somewhere, while I can write the data to the FPGA, I cannot immediately get back the result.
I am currently doing something like this (using Python because it's easy):
% mount /dev/mmcblk1 /memstick
% python
>>> import mmap
>>> import os
>>> f = os.open("/memstick/cam.raw", os.O_RDWR | os.O_DIRECT)
>>> m = mmap.mmap(f, 0)
>>> for i in xrange(1024):
...     m[i] = chr(i % 256)
...
>>> m.flush()       # Make sure data goes from Linux to FPGA
>>> hex(ord(m[0]))  # Should be '0x6'
'0x0'
I can confirm with dd that the data is changed (though I frequently run into buffering issues with that too), and using the tools for the FPGA (SignalTap/ChipScope) I can see that I am indeed getting the correct answer (i.e., the first 32-bit word in this case is 0x03020106). However, something, whether it's Python, Linux, or both, is buffering the file and not reading from the "SD card" (FPGA) again; it keeps serving the file data from memory. I need to shut this off completely so that all reads actually hit the FPGA, but I'm not sure where the buffering is taking place or how to do that.
Any insight would be appreciated! (Note: I can use mmap.flush() to push any data I write from Python out to the FPGA, but I need something like a reverse flush to make it reread the file data into the mmap!)
Update:
As suggested in the comments, the mmap approach might not be the best one for what I need. I have now tried both Python and C using basic I/O functions (os.read/os.write in Python, read/write in C) with the O_DIRECT flag. For most of these operations, I end up getting errno 22 (EINVAL). Still looking into this....

After some digging, I found out what I was doing wrong with the O_DIRECT flag. In my C and Python versions, I wasn't using memalign to allocate the buffer and wasn't doing block-sized reads/writes. This post has a good explanation:
How can I read a file with read() and O_DIRECT in C++ on Linux?
So, in order to achieve what I am doing, this C program works as a basic example:
#define _GNU_SOURCE   /* needed for O_DIRECT */
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>   /* read, write, lseek */
#include <malloc.h>   /* memalign */

#define BLKSIZE 512

int main() {
    int fd;
    int x;
    char *buf;

    fd = open("/home/root/sd/fpga/cam.raw", O_RDWR | O_SYNC | O_DIRECT);
    if (fd < 0) {  /* open returns -1 on failure, not 0 */
        printf("Oh noes, no file!\n");
        return -1;
    }
    printf("%d %d\n", fd, errno);

    buf = (char *) memalign(BLKSIZE, BLKSIZE * 2);
    if (!buf) {
        printf("Oh noes, no buf!\n");
        return -1;
    }

    x = read(fd, buf, BLKSIZE);
    printf("%d %d %x %x %x %x\n", x, errno, buf[0], buf[1], buf[2], buf[3]);

    lseek(fd, 0, 0);
    buf[0] = '1';
    buf[1] = '2';
    buf[2] = '3';
    buf[3] = '4';
    x = write(fd, buf, BLKSIZE);
    printf("%d %d\n", fd, errno);

    lseek(fd, 0, 0);
    x = read(fd, buf, BLKSIZE);
    printf("%d %d %x %x %x %x\n", x, errno, buf[0], buf[1], buf[2], buf[3]);
    return 0;
}
This works for my purposes; I didn't look into how to do proper memory alignment so that Python's os.read/os.write functions can be used in a similar way.
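For completeness, here is one way the same block-aligned access could be done from Python: an anonymous mmap is page-aligned, which satisfies O_DIRECT's alignment requirement, and os.readv/os.writev accept it as a buffer directly. This is a sketch under those assumptions; the helper names are mine, and I have not run it against the FPGA itself.

```python
import mmap
import os

BLKSIZE = 512

def make_aligned_buffer(nblocks=1):
    """Anonymous mmap buffers are page-aligned, which satisfies
    O_DIRECT's alignment requirement on typical filesystems."""
    return mmap.mmap(-1, BLKSIZE * nblocks)

def read_block(fd, buf, offset=0):
    """Read one full block into buf from the given byte offset."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.readv(fd, [buf])

def write_block(fd, buf, offset=0):
    """Write the whole block in buf back at the given byte offset."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.writev(fd, [buf])

# Usage against the FPGA-backed file (run on the target):
# fd = os.open("/memstick/cam.raw", os.O_RDWR | os.O_DIRECT | os.O_SYNC)
# buf = make_aligned_buffer()
# read_block(fd, buf)
# buf[0:4] = b"1234"
# write_block(fd, buf)
# read_block(fd, buf)   # rereads from the device, not the page cache
# os.close(fd)
```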

Related

How to read a file in Python written out by C++

I have one program written in C++ that outputs the data from several different types of arrays. For simplicity, I'm using ints and just writing them out one at a time to figure this out.
I need to be able to read the file in Python, but I'm clearly missing something; I'm having trouble translating the concepts from C++ over to Python.
This is the C++ I have that's working; it writes out two numbers to a file and then reads that file back in. (Yes, I have to use the ostream.write() and istream.read() functions; that's how the library I'm using does it at the base level and I can't change it.)
#include <iostream>
#include <fstream>

int main(int argc, char **argv) {
    std::ofstream fout;
    std::ifstream fin;
    int outval1 = 1234;
    int outval2 = 5678;

    fout.open("out.txt", std::ios::binary);  // binary mode avoids newline translation on Windows
    fout.write(reinterpret_cast<const char*>(&outval1), sizeof(int));
    fout.write(reinterpret_cast<const char*>(&outval2), sizeof(int));
    fout.close();

    int inval;
    fin.open("out.txt", std::ios::binary);
    while (fin.read(reinterpret_cast<char*>(&inval), sizeof(int))) {
        std::cout << inval << std::endl;
    }
    fin.close();
    return 0;
}
This is what I have on the Python side, but I know it's not correct. I don't think I should need to read it in as binary, but that's the only way it's working so far:
with open("out.txt", "rb") as f:
    while (byte := f.read(1)):
        print(byte)
In the simple case you have provided, it is easy to write Python code that reads out 1234 and 5678 (assuming sizeof(int) is 4 bytes) using int.from_bytes.
And yes, you should open the file in binary mode: write() produced raw bytes, not text.
import sys

with open("out.txt", "rb") as f:
    while (byte := f.read(4)):
        print(int.from_bytes(byte, sys.byteorder))
To deal with floats, you may want to try struct.unpack (assuming f is still an open binary file):
import struct

byte = f.read(4)
print(struct.unpack("f", byte)[0])
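For whole files of fixed-size records, struct.iter_unpack can replace the read loop entirely. A sketch (read_ints is a name I made up; "<i" assumes the writer was a little-endian machine with 4-byte ints):

```python
import struct

def read_ints(path, fmt="<i"):
    """Return every fixed-size int in the file. "<i" assumes
    little-endian 4-byte ints, matching a typical x86 writer."""
    with open(path, "rb") as f:
        data = f.read()
    return [v for (v,) in struct.iter_unpack(fmt, data)]

# print(read_ints("out.txt"))   # with the C++ writer above: [1234, 5678]
```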

Convert float into String and send from C code to Python through Named Pipe

I would like to send float values from C code to Python code using named pipes. I print the received values to the terminal on the Python side; however, along with the value itself, gibberish characters are also displayed.
Pipe opening:
void Init_FIFO(void)
{
    // FIFO file path
    char *bldc_fifo = "/tmp/bldc_fifo";

    // Creating the named FIFO -- mkfifo(<pathname>, <permission>)
    mkfifo(bldc_fifo, 0666);

    // Open FIFO to write/read data
    fd_fifo = open(bldc_fifo, O_RDWR | O_NONBLOCK);
    //fd_fifo = open(bldc_fifo, O_WRONLY | O_RDONLY | O_NONBLOCK);
}
For the conversion of float to string I use sprintf; the code is given below:
void SendDataOverFifo(float angle)
{
    char str[64];
    unsigned char writeBuffer[] = "Hello!";

    Init_FIFO();
    sprintf(str, "%f\n", angle);
    write(fd_fifo, str, sizeof(str));
    //write(fd_fifo, writeBuffer, sizeof(writeBuffer));
    close(fd_fifo);
}
Then, to receive the data on the Python side, I use this:
#!/usr/bin/python
import os
import errno
import time

FIFO = '/tmp/bldc_fifo'

try:
    os.mkfifo(FIFO)
except OSError as oe:
    if oe.errno != errno.EEXIST:
        raise

print("Opening FIFO...")
with open(FIFO, encoding='utf-8', errors='ignore') as fifo:
    print("FIFO opened")
    while True:
        time.sleep(0.1)
        data = fifo.read()
        print(data)
The output I am getting is something like this
i-W ?UOeiEU11.417070
Where the correct result should be:
11.417070
A note: If I try to send only "Hello!", it works without any problems.
What am I missing here? Thanks in advance.
The first red flag is the sprintf call; it doesn't know how large your target buffer str is, so it could overflow if you're not careful. With a single float and 64 bytes, that step should be fine.
However, you didn't store the return value, so at this point you don't know how large the formatted text is. Then you used sizeof, which tells you how large the buffer is, not how much data you just put into it. You could use a string-based function (since sprintf wrote a NUL-terminated string), such as strlen (to measure the string) or fputs (to write the string to a file).
A much easier shortcut might be to use fprintf in the first place, and not need to allocate a separate buffer (it likely uses one built into FILE) to store the formatted string.
It is possible, albeit not necessarily portable or safe, to convert between file descriptors (such as write and close use) and FILE (such as fprintf uses) using functions such as fdopen.
The line:
write(fd_fifo, str, sizeof(str));
is causing uninitialized memory to be written to the FIFO. You don't want to write the whole str buffer, only the length of the string you want to pass. You can find that out from snprintf's return value or by using strlen(str).
int ret = sprintf(str, "%f", ...);
assert(ret > 0); // just to be safe
write(fd_fifo, str, ret);
Using sprintf is unsafe for your case; use snprintf to protect against buffer overflow:
int ret = snprintf(str, sizeof(str), ....
// no other changes
That way snprintf will never write more than sizeof(str) characters into the buffer.
However the best way is to not have a statically allocated buffer. You can use fdopen:
FILE *f = fdopen(fd_fifo, "w");
if (f == NULL) {
// handle error
}
int ret = fprintf(f, "%f", ...);
if (ret < 0) {
// handle error
}
fclose(f);
or determine the required size beforehand with snprintf(NULL, 0, ...), call malloc, and format again:
int ret = snprintf(NULL, 0, "%f", ...);
assert(ret > 0);
char *str = malloc(ret + 1);  /* +1 for the terminating NUL */
if (str == NULL) {
    // handle error
}
ret = snprintf(str, ret + 1, "%f", ...);
write(fd_fifo, str, ret);
free(str);
I solved the problem, the solution was changing this line
write(fd_fifo, str, sizeof(str));
to
write(fd_fifo, str, strlen(str));
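On the Python side, since each message the C code sends is newline-terminated, reading line by line yields one clean value per write. A sketch (read_floats is a hypothetical helper name; stripping NULs guards against the padding bytes the original sizeof bug produced):

```python
def read_floats(stream):
    """Yield one float per newline-terminated message, ignoring stray NUL bytes."""
    for line in stream:
        line = line.strip().strip("\x00")
        if line:
            yield float(line)

# Usage against the FIFO from the question:
# with open("/tmp/bldc_fifo") as fifo:
#     for angle in read_floats(fifo):
#         print(angle)
```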

.bin to .cfile flowgraph for GRC 3.7.2.1

I have tried opening the flow graph for converting a .bin file (data captured via RTL-SDR) to a .cfile for analysis. I downloaded the file from the link http://sdr.osmocom.org/trac/attachment/wiki/rtl-sd...
However, I am unable to get it working on GRC 3.7.2.1. I get a long list of error messages (given below) when I just try to open the file.
I am using Ubuntu v14.04.1.
I would be really grateful for any help solving this, or for any alternate way to convert the .bin file to a .cfile (Python source code?).
=======================================================
<<< Welcome to GNU Radio Companion 3.7.2.1 >>>
Showing: ""
Loading: "/home/zorro/Downloads/rtl2832-cfile.grc"
Error:
/home/zorro/Downloads/rtl2832-cfile.grc:2:0:ERROR:VALID:DTD_UNKNOWN_ELEM:
No declaration for element html
/home/zorro/Downloads/rtl2832-cfile.grc:2:0:ERROR:VALID:DTD_UNKNOWN_ATTRIBUTE:
No declaration for attribute xmlns of element html
/home/zorro/Downloads/rtl2832-cfile.grc:9:0:ERROR:VALID:DTD_UNKNOWN_ELEM:
No declaration for element head
/home/zorro/Downloads/rtl2832-cfile.grc:10:0:ERROR:VALID:DTD_UNKNOWN_ELEM:
The cause of the errors you are seeing is that your link is bad: it is truncated and points to an HTML page, not a GRC file. The errors come from GRC trying to interpret the HTML as GRC XML. The correct link to the download is: http://sdr.osmocom.org/trac/raw-attachment/wiki/rtl-sdr/rtl2832-cfile.grc
However, note that that flowgraph was built for GNU Radio 3.6 and will not work in GNU Radio 3.7 due to many blocks being internally renamed. I would recommend rebuilding it from scratch using the provided picture.
Since there are no variables in this flowgraph, you can simply drag out the blocks and set the parameters as shown. Doing so will be a good exercise for familiarizing yourself with the GNU Radio Companion user interface, too.
If you look at the flowgraph posted by @Kevin Reid above, you can see that it takes the input data, subtracts 127, multiplies by 0.008, and converts pairs to complex.
What is missing is the exact types. That is covered in the GNU Radio FAQ: uchar is an unsigned char (8 bits), and the complex data type is 'complex64' in Python.
If done in numpy, as an in-memory operation, it looks like this:
import numpy as np
import sys

(scriptName, inFileName, outFileName) = sys.argv

ubytes = np.fromfile(inFileName, dtype='uint8', count=-1)

# we need an even number of bytes;
# discard the last byte if the count is odd
if len(ubytes) % 2 == 1:
    ubytes = ubytes[0:-1]

print("read " + str(len(ubytes)) + " bytes from " + inFileName)

# scale the unsigned byte data to floats in roughly the interval -1.0 to 1.0
ufloats = 0.008 * (ubytes.astype(float) - 127.0)
ufloats.shape = (len(ubytes) // 2, 2)

# turn the pairs of floats into complex numbers, as needed by gqrx and other GNU Radio software
IQ_data = (ufloats[:, 0] + 1j * ufloats[:, 1]).astype('complex64')
IQ_data.tofile(outFileName)
I've tested this translating from the rtl_sdr file format to the gqrx IQ sample input file format and it seems to work fine within what can fit in memory.
But beware that this script only works when both the input and output files fit in memory. For input files larger than about 1/5 of system memory, which SDR recordings can easily exceed, it is better to process the data in smaller pieces.
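One way to keep memory bounded while staying in numpy is to process the file in fixed-size chunks. A sketch (convert_chunked and CHUNK are names and sizes I chose; any even chunk size keeps the I/Q pairs aligned):

```python
import numpy as np

CHUNK = 1 << 20  # bytes of input per iteration; even, so I/Q pairs stay paired

def convert_chunked(in_name, out_name):
    """Stream rtl_sdr uint8 I/Q pairs to complex64 without loading the whole file."""
    with open(in_name, "rb") as fi, open(out_name, "wb") as fo:
        while True:
            raw = np.fromfile(fi, dtype=np.uint8, count=CHUNK)
            if raw.size == 0:
                break
            raw = raw[:raw.size - (raw.size % 2)]  # drop a trailing odd byte
            floats = 0.008 * (raw.astype(np.float32) - 127.0)
            iq = (floats[0::2] + 1j * floats[1::2]).astype(np.complex64)
            iq.tofile(fo)
```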
We can avoid memory-hogging entirely by reading the data one byte at a time in a loop, as in the following C program. This isn't the cleanest code; I should probably add fclose and check ferror, but it works as-is for hobby purposes.
#include <complex.h>
#include <stdio.h>
#include <stdlib.h>

// rtlsdr-to-gqrx Copyright 2014 Paul Brewer KI6CQ
// License: CC BY-SA 3.0 or GNU GPL 3.0
// IQ file converter
// from rtl_sdr recording format -- interleaved unsigned char
// to gqrx/gnuradio .cfile playback format -- complex64

int main(int argc, char *argv[])
{
    int byte1, byte2; // int -- not unsigned char -- see fgetc man page
    float _Complex fc;
    const size_t fc_size = sizeof(fc);
    FILE *infile, *outfile;
    const float scale = 1.0 / 128.0;

    if (argc < 3) {  // check argc before touching argv[1] and argv[2]
        printf("usage: rtlsdr-to-gqrx infile outfile\n");
        exit(1);
    }
    const char *infilename = argv[1];
    const char *outfilename = argv[2];

    // printf("in= %s out= %s \n", infilename, outfilename);
    infile = fopen(infilename, "rb");
    outfile = fopen(outfilename, "wb");
    if ((infile == NULL) || (outfile == NULL)) {
        printf("Error opening files\n");
        exit(1);
    }

    while ((byte1 = fgetc(infile)) != EOF) {
        if ((byte2 = fgetc(infile)) == EOF) {
            exit(0);
        }
        fc = scale * (byte1 - 127) + I * scale * (byte2 - 127);
        fwrite(&fc, fc_size, 1, outfile);
    }
    return 0;
}

Doing math on numbers in a specific part of a file

I have many files containing:
data: numbers that I have to use/manipulate, formatted in a specific way, specified below;
rows that I need kept just as they are (configurations of the software use these files).
The files are most of the time huge, many millions of rows, and can't be handled fast enough with bash. I have made a script that checks each line to see if it's data and writes it to another file (without calculations), but it's very slow (many thousand rows per second).
The data is formatted in a way like this:
text
text
(
($data $data $data)
($data $data $data)
($data $data $data)
)
text
text
(
($data $data $data)
($data $data $data)
)
text
( text )
( text )
(text text)
I have to produce another file, using $data, containing the results of some operations on it.
The portions of the file that contain numbers can be distinguished by the presence of this opening sequence:
(
(
and the matching closing sequence:
)
)
at the end.
I've written a C++ program before that performs the operation I want, but only for files containing columns of numbers. I don't know how to ignore the text that I don't have to modify, or how to handle the way the data is formatted.
Where should I look to solve my problem smartly?
Which would be the best way to handle data files formatted in different ways and do math on them? Maybe Python?
Are you sure that the shell isn't fast enough? Maybe your bash just needs improved. :)
It appears that you want to print every line after a line with just a ( until you get to a closing ). So...
#!/usr/bin/ksh
print=0
while read
do
if [[ "$REPLY" == ')' ]]
then
print=0
elif [[ "$print" == 1 ]]
then
echo "${REPLY//[()]/}"
elif [[ "$REPLY" == '(' ]]
then
print=1
fi
done
exit 0
And, with your provided test data:
danny@machine:~$ ./test.sh < file
$data $data $data
$data $data $data
$data $data $data
$data $data $data
$data $data $data
I'll bet you'll find that to be roughly as fast as anything else you would write. If I were going to use this often, I'd be inclined to add several more error checks; but if your data is well-formed, this will work fine.
Alternatively, you could just use sed.
danny@machine:~$ sed -n '/^($/,/^)$/{/^[()]$/d;s/[()]//gp}' file
$data $data $data
$data $data $data
$data $data $data
$data $data $data
$data $data $data
performance note edit:
I was comparing python implementations below, so I thought I'd test these as well. The sed solution runs about identically to the fastest python implementation on the same data - less than one second (0.9 seconds) to filter ~80K lines. The bash version takes 42.5 seconds to do it. However, just replacing #!/bin/bash with #!/usr/bin/ksh above (which is ksh93, on Ubuntu 13.10) and making no other changes to the script reduces runtime down to 10.5 seconds. Still slower than python or sed, but that's part of why I hate scripting in bash.
I also updated both solutions to remove the opening and closing parens, to be more consistent with the other answers.
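For reference, here is the straightforward line-streaming Python filter the timing comparisons refer to, written as a sketch (filter_data is a name I chose): it prints every line between a lone "(" and its closing lone ")", with parentheses stripped, without ever holding the whole file in memory.

```python
import sys

def filter_data(lines):
    """Yield data lines between a lone '(' and the matching lone ')',
    with any parentheses removed."""
    in_block = False
    for line in lines:
        stripped = line.strip()
        if stripped == ")":
            in_block = False
        elif in_block:
            yield line.replace("(", "").replace(")", "")
        elif stripped == "(":
            in_block = True

if __name__ == "__main__":
    sys.stdout.writelines(filter_data(sys.stdin))
```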
Here is something which should perform well on huge data, and it's using Python 3:
#!/usr/bin/python3
import mmap

fi = open('so23434490in.txt', 'rb')
m = mmap.mmap(fi.fileno(), 0, access=mmap.ACCESS_READ)
fo = open('so23434490out.txt', 'wb')

p2 = 0
while True:
    p1 = m.find(b'(\n(', p2)
    if p1 == -1:
        break
    p2 = m.find(b')\n)', p1)
    if p2 == -1:
        break  # unmatched opening sequence!
    data = m[p1+3:p2]
    data = data.replace(b'(', b'').replace(b')', b'')
    # Now decide: either do some computation on that data in Python
    for line in data.split(b'\n'):
        cols = list(map(float, line.split(b' ')))
        # perform some operation on cols
    # Or simply write out the data to use it as input for your C++ code
    fo.write(data)
    fo.write(b'\n')

fo.close()
m.close()
fi.close()
This uses mmap to map the file into memory. Then you can access it easily without having to worry about reading it in. It is also very efficient, since it avoids unnecessary copying (from the page cache to the application heap).
I guess we need a perl solution, too.
#!/usr/bin/perl
my $p = 0;
while (<STDIN>) {
    if ( /^\)\s*$/ ) {
        $p = 0;
    }
    elsif ( $p ) {
        s/[()]//g;
        print;
    }
    elsif ( /^\(\s*$/ ) {
        $p = 1;
    }
}
On my system, this runs slightly slower than the fastest python implementation from above (while also doing the parenthesis removal), and about the same as
sed -n '/^($/,/^)$/{/^[()]$/d;s/[()]//gp}'
Using C provides much better speed than bash/ksh or C++ (or Python, even though saying that stings). I created a text file containing 18 million lines, the example text duplicated 1 million times. On my laptop, this C program processes the file in 1 second, while the Python version takes 5 seconds, and running the bash version under ksh (because it's faster than bash) with the edits mentioned in that answer's comments takes 1 minute 20 seconds (i.e., 80 seconds). Note that this C program doesn't check for errors at all, except for the non-existent file. Here it is:
#include <string.h>
#include <stdio.h>

#define BUFSZ 1024
// I highly doubt there are lines longer than 1024 characters

int main()
{
    int is_area = 0;
    char line[BUFSZ];
    FILE *f;

    if ((f = fopen("out.txt", "r")) != NULL)
    {
        while (fgets(line, BUFSZ, f))
        {
            if (line[0] == ')') is_area = 0;
            else if (is_area) fputs(line, stdout); // NO NEWLINE!
            else if (strcmp(line, "(\n") == 0) is_area = 1;
        }
    }
    else
    {
        fprintf(stderr, "THE SKY IS FALLING!!!\n");
        return 1;
    }
    return 0;
}
If the fact that it's completely unsafe freaks you out, here's a C++ version, which took 2 seconds:
#include <iostream>
#include <fstream>
#include <string>

using namespace std;
// ^ FYI, the above is a bad idea, but I'm trying to preserve clarity

int main()
{
    ifstream in("out.txt");
    string line;
    bool is_area(false);

    while (getline(in, line))
    {
        if (line[0] == ')') is_area = false;
        else if (is_area) cout << line << '\n';
        else if (line == "(") is_area = true;
    }
    return 0;
}
EDIT: As MvG pointed out in the comments, I wasn't benching the Python version fairly. It doesn't take 24 seconds as I originally stated, but 5 instead.

How to read packet saved in a file with Python?

I have C++ code that generates an IP packet header. The code uses a struct representing each field in the packet:
struct cip {
    uint8_t ip_hl:4,  /* both fields are 4 bits */
            ip_v:4;
    uint8_t ip_tos;
    uint16_t ip_len;
    uint16_t ip_id;
    uint16_t ip_off;
    uint8_t ip_ttl;
    uint8_t ip_p;
    uint16_t ip_sum;
    struct in_addr ip_src;
    struct in_addr ip_dst;
    char head[100];
};
The user is prompted to enter the values for each field in the struct:
Enter the filename to save the packet: packet
Enter IP version(0-15): 4
Enter Header Length(5-15): 5
Enter type of service(0-255): 55
Enter packet total size(bytes, 20, 200): 25
The packet is created and saved in a file:
FILE *f = fopen(file, "w");
int success = fwrite(&packet, sizeof(char), ((unsigned int)packet.ip_hl)*4, f);
if (success <= 0) {
    printf("Error writing packet header");
}
success = fwrite(&data, sizeof(char), ntohs(packet.ip_len) - (4*packet.ip_hl), f);
if (success < 0) {
    printf("Error writing packet data");
}
fflush(f);
fclose(f);
printf("\nPacket Written.\n");
I didn't create this code; someone gave it to me so that I can create another program, in Python, that validates the packet created by the program above. The validation includes verifying the checksum generated for the packet, the version of the IP packet, the protocol, the length of the header, and so on.
So I would like to know if someone can help me figure out how to read the file and parse the frame. I tried reading the file contents as a string, but the problem I'm having is that the file looks like this after creation (it is unreadable):
O È ,# šÀ¨À¨
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxDATA_______________________DATA____________ ô·
I don't understand why (I'm guessing it's because the variables bigger than 1 byte are converted to big-endian by the function htons):
printf("\nEnter ip ID number(0-65535):\n");
scanf("%d", &input);
packet.ip_id = htons(input);
I looked at other options such as socket.makefile(), but that treats a socket in my program as a file, whereas what I need is to parse the frame given to me in this file.
Any ideas?
Thanks.
P.S.: Can someone also give me a link explaining how to convert integers from big-endian to little-endian and vice versa in Python? Thanks!
You should read the file in binary mode (mandatory on Windows, good practice everywhere). Note that iterating line by line is unreliable for binary data, since a 0x0A byte can occur anywhere inside the packet; it's safer to read the whole thing at once:
with open("test.txt", "rb") as f:
    data = f.read()
    # process data
To unpack the binary data you should use the struct module, which can also handle big- and little-endian layouts. Example for your struct:
print(struct.unpack('BBHHHBBH4s4s100s', data))
I unpacked ip_src and ip_dst as raw 4-byte strings ('4s'), since struct in_addr holds a 4-byte IPv4 address and you didn't specify its contents further. The smallest unit struct can read is one byte, so to split the first field into its two 4-bit parts you can use:
(ip_hl, ip_v) = (value >> 4, value & 15)
Of course, which nibble is which depends on your compiler's bit-field layout.
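Putting it together, here is a sketch that parses the fixed 20-byte header portion. The helper name parse_header is mine; "!" assumes every multi-byte field was stored via htons/htonl (as ip_id was in the question), and the version/IHL split assumes the C writer ran on a little-endian machine, where the bit-fields land in the standard on-wire nibble order.

```python
import socket
import struct

def parse_header(raw):
    """Parse the first 20 bytes of an IPv4-style header in network byte order."""
    (ver_ihl, tos, total_len, ident, frag,
     ttl, proto, csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,   # high nibble on the wire
        "ihl": ver_ihl & 0x0F,     # header length in 32-bit words
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "offset": frag,
        "ttl": ttl,
        "protocol": proto,
        "checksum": csum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Against the file from the question's example prompt:
# with open("packet", "rb") as f:
#     print(parse_header(f.read()))
```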
