Objective
I'm trying to extract the GPS "Latitude" and "Longitude" data from a bunch of JPG's and I have been successful so far but my main problem is that when I try to write the coordinates to a text file for example I see that only 1 set of coordinates was written compared to my console output which shows that every image was extracted. Here is an example: Console Output and here is my text file that is supposed be a mirror output along my console: Text file
I don't fully understand whats the problem and why it won't just write all of them instead of one. I believe it is being overwritten somehow or the 'GPSPhoto' module is causing some issues.
Code
from glob import glob
from GPSPhoto import gpsphoto
# Scan jpg's that are located in the same directory.
data = glob("*.jpg")
# Scan contents of images and GPS values.
for x in data:
data = gpsphoto.getGPSData(x)
data = [data.get("Latitude"), data.get("Longitude")]
print("\nsource: {}".format(x), "\n ↪ {}".format(data))
# Write coordinates to a text file.
with open('output.txt', 'w') as f:
print('Coordinates:', data, file=f)
I have tried pretty much everything that I can think of including: changing the write permissions, not using glob, no loops, loops, lists, no lists, different ways to write to the file, etc.
Any help is appreciated because I am completely lost at this point. Thank you.
You're replacing the data variable each time through the loop, not appending to a list.
all_coords = []
for x in data:
data = gpsphoto.getGPSData(x)
all_coords.append([data.get("Latitude"), data.get("Longitude")])
with open('output.txt', 'w') as f:
print('Coordinates:', all_coords, file=f)
Related
I am new here to try to solve one of my interesting questions in World of Tanks. I heard that every battle data is reserved in the client's disk in the Wargaming.net folder because I want to make a batch of data analysis for our clan's battle performances.
image
It is said that these .dat files are a kind of json files, so I tried to use a couple of lines of Python code to read but failed.
import json
f = open('ex.dat', 'r', encoding='unicode_escape')
content = f.read()
a = json.loads(content)
print(type(a))
print(a)
f.close()
The code is very simple and obviously fails to make it. Well, could anyone tell me the truth about that?
Added on Feb. 9th, 2022
After I tried another set of codes via Jupyter Notebook, it seems like something can be shown from the .dat files
import struct
import numpy as np
import matplotlib.pyplot as plt
import io
with open('C:/Users/xukun/Desktop/br/ex.dat', 'rb') as f:
fbuff = io.BufferedReader(f)
N = len(fbuff.read())
print('byte length: ', N)
with open('C:/Users/xukun/Desktop/br/ex.dat', 'rb') as f:
data =struct.unpack('b'*N, f.read(1*N))
The result is a set of tuple but I have no idea how to deal with it now.
Here's how you can parse some parts of it.
import pickle
import zlib
file = '4402905758116487.dat'
cache_file = open(file, 'rb') # This can be improved to not keep the file opened.
# Converting pickle items from python2 to python3 you need to use the "bytes" encoding or "latin1".
legacyBattleResultVersion, brAllDataRaw = pickle.load(cache_file, encoding='bytes', errors='ignore')
arenaUniqueID, brAccount, brVehicleRaw, brOtherDataRaw = brAllDataRaw
# The data stored inside the pickled file will be a compressed pickle again.
vehicle_data = pickle.loads(zlib.decompress(brVehicleRaw), encoding='latin1')
account_data = pickle.loads(zlib.decompress(brAccount), encoding='latin1')
brCommon, brPlayersInfo, brPlayersVehicle, brPlayersResult = pickle.loads(zlib.decompress(brOtherDataRaw), encoding='latin1')
# Lastly you can print all of these and see a lot of data inside.
The response contains a mixture of more binary files as well as some data captured from the replays.
This is not a complete solution but it's a decent start to parsing these files.
First you can look at the replay file itself in a text editor. But it won't show the code at the beginning of the file that has to be cleaned out. Then there is a ton of info that you have to read in and figure out but it is the stats for each player in the game. THEN it comes to the part that has to do with the actual replay. You don't need that stuff.
You can grab the player IDs and tank IDs from WoT developer area API if you want.
After loading the pickle files like gabzo mentioned, you will see that it is simply a list of values and without knowing what the value is referring to, its hard to make sense of it. The identifiers for the values can be extracted from your game installation:
import zipfile
WOT_PKG_PATH = "Your/Game/Path/res/packages/scripts.pkg"
BATTLE_RESULTS_PATH = "scripts/common/battle_results/"
archive = zipfile.ZipFile(WOT_PKG_PATH, 'r')
for file in archive.namelist():
if file.startswith(BATTLE_RESULTS_PATH):
archive.extract(file)
You can then decompile the python files(uncompyle6) and then go through the code to see the identifiers for the values.
One thing to note is that the list of values for the main pickle objects (like brAccount from gabzo's code) always has a checksum as the first value. You can use this to check whether you have the right order and the correct identifiers for the values. The way these checksums are generated can be seen in the decompiled python files.
I have been tackling this problem for some time (albeit in Rust): https://github.com/dacite/wot-battle-results-parser/tree/main/datfile_parser.
I've got a bunch of .dcm-files (dice-files) where I would like to extract the header and save the information there in a CSV file.
As you can see in the following picture, I've got a problem with the delimiters:
For example when looking at the second line in the picture: I'd like to split it like this:
0002 | 0000 | File Meta Information Group Length | UL | 174
But as you can see, I've not only multiple delimiters but also sometimes ' ' is one and sometimes not. Also the length of the 3rd column varies, so sometimes there is only a shorter text there, e.g. Image Type further down in the picture.
Does anyone have a clever idea, how to write it in a CSV file?
I use pydicom to read and display the files in my IDE.
I'd be very thankful for any advice :)
I would suggest going back to the data elements themselves and working from there, rather than from a string output (which is really meant for exploring in interactive sessions)
The following code should work for a dataset with no Sequences, would need some modification to work with sequences:
import csv
import pydicom
from pydicom.data import get_testdata_file
filename = get_testdata_file("CT_small.dcm") # substute your own filename here
ds = pydicom.dcmread(filename)
with open('my.csv', 'w', newline='') as csvfile:
writer = csv.writer(csvfile)
writer.writerow("Group Elem Description VR value".split())
for elem in ds:
writer.writerow([
f"{elem.tag.group:04X}", f"{elem.tag.element:04X}",
elem.description(), elem.VR, str(elem.value)
])
It may also require a bit of change to make the elem.value part look how you want it, or you may want to set the CSV writer to use quotes around items, etc.
Output looks like:
Group,Elem,Description,VR,value
0008,0005,Specific Character Set,CS,ISO_IR 100
0008,0008,Image Type,CS,"['ORIGINAL', 'PRIMARY', 'AXIAL']"
0008,0012,Instance Creation Date,DA,20040119
0008,0013,Instance Creation Time,TM,072731
...
I am using the Instaloader package to scrape some data from Instagram.
Ideally, I am looking to scrape the posts associated to a specific hashtag. I created the code below, and it outputs lines of scraped data, but I am unable to write this output to a .txt, .csv, or .json file.
I tried to first append the loop output to a list, but the list was empty. My efforts to output to a file have also been unsuccessful.
I know I am missing a step here, please provide any input that you have! Thank you.
import instaloader
import json
L= instaloader.Instaloader()
for posts in L.get_hashtag_posts('NewYorkCity'):
L.download_hashtag('NewYorkCity', max_count = 10)
with open('output.json', 'w') as f:
json.dump(posts, f)
break
print('Done')
Looking at your code, it seems the spacing might be a bit off. When you use the open command with the "with" statement it does a try: / finally: on the file object.
In your case you created a variable f representing this file object. I believe if you make the following change you can write your json data to the file.
import instaloader
import json
L= instaloader.Instaloader()
for posts in L.get_hashtag_posts('NewYorkCity'):
L.download_hashtag('NewYorkCity', max_count = 10)
with open('output.json', 'w') as f:
f.write(json.dump(posts))
break
print('Done')
I am sure you intended this, but if your goal in the break was to only get the first value in the loop returned you could make this edit as well
for posts in L.get_hashtag_posts('NewYorkCity')[0]:
This will return the first item in the list.
If you would like the first 3, for example, you could do [:3]
See this tutorial on Python Data Structures
I have a JSON file whose size is about 5GB. I neither know how the JSON file is structured nor the name of roots in the file. I'm not able to load the file in the local machine because of its size So, I'll be working on high computational servers.
I need to load the file in Python and print the first 'N' lines to understand the structure and Proceed further in data extraction. Is there a way in which we can load and print the first few lines of JSON in python?
If you want to do it in Python, you can do this:
N = 3
with open("data.json") as f:
for i in range(0, N):
print(f.readline(), end = '')
You can use the command head to display the N first line of the file. To get a sample of the json to know how is it structured.
And use this sample to work on your data extraction.
Best regards
I have a log record like this (millions of rows):
previous_status>SERVICE</previous_status><reason>1</>device_id>SENSORS</device_id><DEVICE>ISCS</device_type><status>OK
I would like to to extract all the words in capital into individual columns in excel using python to look like this :
SERVICE SENSORS DEVICE
As per the comments from #peter-wood, it isn't clear what your input is. However, assuming that your input is as you posted, then here is a minimal solution that works off the given structure. If it is not quite right, you should be able to easily change it to search on whatever is really your structure.
import csv
# You need to change this path.
lines = [row.strip() for row in open('/path/to/log.txt').readlines()]
# You need to change this path to where you want to write the file.
with open('/path/to/write/to/mydata.csv', 'w') as fh:
# If you want a different delimiter, like tabs '\t', change it here.
writer = csv.writer(fh, delimiter=',')
for l in lines:
# You can cut and paste the tokens that start and stop the pieces you are looking for here.
service = l[l.find('previous_status>')+len('previous_status>'):l.find('</previous_status')]
sensor = l[l.find('device_id>')+len('device_id>'):l.find('</device_id>')]
device = l[l.find('<DEVICE>')+len('<DEVICE>'):l.find('</device_type>')]
writer.writerow([service, sensor, device])