How to access and replace values in a dict-list? - python

[python]
I have a multi-stage dict-list. I want to access a location in the dict-list and replace it with the list data and data_result. I am able to access the location and append the values to a variable. However, I don't know how to create a variable that stores the entire dict-list together with the replaced/appended list.
Multi-stage structure, focusing on the data and results lists:
Note: all parameters are set to a value.
data = {
    "uuid": str(uuid),
    "name": str(tag + "_" + testName + "_" + str(datetime.datetime.utcnow())),
    "verdict": str(totalResult)
}
print(data)
data_result = {
    "Test_Case": str(Test_Case),
    "Test_Result": str(result),
    "Time Evaluation": str(duration)
}
res = tcm_template  ## ** HERE **
res2 = tcm_template['testcatalogmanager']['ut']['tests']
for res3 in res2:
    res4 = res3['data']
    for res5 in res4:  # res4 is the data list
        final_result = res5['results']
        final_result.append(data_result)
        # tcm_template.replace(data)        ## data - prototype code
        # tcm_template.append(final_result) ## results - prototype code
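A minimal sketch of one way to finish this, assuming tcm_template really has the nesting shown above (and assuming data is meant to overwrite keys on each data entry, which the question leaves open): dicts and lists are mutable, so updating them in place updates tcm_template itself, and no separately rebuilt copy is needed.
for test in tcm_template['testcatalogmanager']['ut']['tests']:
    for entry in test['data']:
        entry['results'].append(data_result)  # append the new result row in place
        entry.update(data)                    # assumption: uuid/name/verdict replace keys here
final = tcm_template  # the whole structure, with the appended/replaced values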

Related

Unable to get proper output of TFLite model in Kotlin

I used the FER2013 dataset from Kaggle and trained a CNN model, saved the model as TFLite, and made a Kotlin app using it. Now I am not able to get proper output. Sample output of the model: [0. 0. 0. 1. 0. 0. 0.] for happy.
Please check the code for MainActivity.kt. I am a complete newbie. Thank you so much for bearing with me.
package com.example.mooddetector

import android.content.pm.PackageManager
import android.graphics.Bitmap
import android.os.Bundle
import android.util.Log
import android.widget.Button
import android.widget.ImageView
import android.widget.TextView
import android.widget.Toast
import androidx.activity.result.contract.ActivityResultContracts
import androidx.appcompat.app.AppCompatActivity
import androidx.core.content.ContextCompat
import com.example.mooddetector.databinding.ActivityMainBinding
import com.example.mooddetector.ml.MoodDetector
import org.tensorflow.lite.DataType
import org.tensorflow.lite.support.image.ColorSpaceType
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.support.image.ops.TransformToGrayscaleOp
import org.tensorflow.lite.support.tensorbuffer.TensorBuffer

class MainActivity : AppCompatActivity() {

    private lateinit var binding: ActivityMainBinding
    private lateinit var imageView: ImageView
    private lateinit var button: Button
    private lateinit var tvOutput: TextView

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        binding = ActivityMainBinding.inflate(layoutInflater)
        val view = binding.root
        setContentView(view)
        imageView = binding.imageView
        button = binding.btnCaptureImage
        tvOutput = binding.tvOutput
        val buttonLoad = binding.btnLoadImage
        button.setOnClickListener {
            if (ContextCompat.checkSelfPermission(this, android.Manifest.permission.CAMERA)
                == PackageManager.PERMISSION_GRANTED
            ) {
                takePicturePreview.launch(null)
            } else {
                requestPermission.launch(android.Manifest.permission.CAMERA)
            }
        }
    }

    private val requestPermission = registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
        if (granted) {
            takePicturePreview.launch(null)
        } else {
            Toast.makeText(this, "Permission Denied.", Toast.LENGTH_SHORT).show()
        }
    }

    private val takePicturePreview = registerForActivityResult(ActivityResultContracts.TakePicturePreview()) { bitmap ->
        if (bitmap != null) {
            imageView.setImageBitmap(bitmap)
            outputGenerator(bitmap)
        }
    }

    private fun outputGenerator(bitmap: Bitmap) {
        val model = MoodDetector.newInstance(this)
        val newBitmap = Bitmap.createScaledBitmap(bitmap, 48, 48, true)
        val tfimage = TensorImage(DataType.FLOAT32)
        tfimage.load(newBitmap)
        val tfimagegrayscale = TransformToGrayscaleOp().apply(tfimage)
        val tensorbuffr = tfimagegrayscale.tensorBuffer
        val tensorimg = TensorImage(DataType.FLOAT32)
        tensorimg.load(tensorbuffr, ColorSpaceType.GRAYSCALE)
        val byteBuffer = tensorimg.buffer
        // Creates inputs for reference.
        val inputFeature0 = TensorBuffer.createFixedSize(intArrayOf(1, 48, 48, 1), DataType.FLOAT32)
        inputFeature0.loadBuffer(byteBuffer)
        // Runs model inference and gets result.
        val outputs = model.process(inputFeature0)
        val outputFeature0 = outputs.outputFeature0AsTensorBuffer
        tvOutput.text = outputFeature0.toString()
        Log.d("TAG", outputs.toString())
        Log.d("TAG", outputFeature0.toString())
        // val data1 = outputFeature0.floatArray
        // Log.d("TAG2", outputFeature0.dataType.toString())
        // Log.d("TAG2", data1[0].toString())
        // val probabilityBuffer = TensorBuffer.createFixedSize(intArrayOf(1, 1001), DataType.UINT8)
        // Releases model resources if no longer used.
        model.close()
    }
}
The output of the last two Log.d calls is:
com.example.mooddetector.ml.MoodDetector$Outputs#a04fe1
org.tensorflow.lite.support.tensorbuffer.TensorBufferFloat#ca3b548
The docs for TensorBufferFloat list two methods that might be useful to you:
float[] getFloatArray()
Returns a float array of the values stored in this buffer.
float getFloatValue(int absIndex)
Returns a float value at a given index.
The docs are for Java, which means that in Kotlin all getters (and setters) just become normal field/property access, i.e. .getFloatArray() becomes just .floatArray.
So if you get the whole array with .floatArray, you can join the values together with some separator to get a string representation:
val output = outputFeature0.floatArray.joinToString(", ", "[", "]")
Log.d("TAG", output)
If you want to control the formatting, use DecimalFormat (this needs imports for java.text.DecimalFormat, java.text.DecimalFormatSymbols, java.math.RoundingMode, and java.util.Locale):
val pattern = "#.0#" // rounds to 2 decimal places if needed
val locale = Locale.ENGLISH
val formatter = DecimalFormat(pattern, DecimalFormatSymbols(locale))
formatter.roundingMode = RoundingMode.HALF_EVEN // this is the default rounding mode anyway
val output = outputFeature0.floatArray.joinToString(", ", "[", "]") { value ->
    formatter.format(value)
}
Log.d("TAG", output)
If you need to do this formatting in different places, you can move the logic into an extension method:
fun FloatArray.joinToFormattedString(): String {
    val pattern = "#.0#" // rounds to 2 decimal places if needed
    val locale = Locale.ENGLISH
    val formatter = DecimalFormat(pattern, DecimalFormatSymbols(locale))
    formatter.roundingMode = RoundingMode.HALF_EVEN // this is the default rounding mode anyway
    return this.joinToString(
        separator = ", ",
        prefix = "[",
        postfix = "]",
    ) { value ->
        formatter.format(value)
    }
}
Then you can simply call
Log.d("TAG", outputFeature0.floatArray.joinToFormattedString())

PyParsing: parse if not a keyword

I am trying to parse a file as follows:
testp.txt
title = Test Suite A;
timeout = 10000
exp_delay = 500;
log = TRUE;
sect
{
    type = typeA;
    name = "HelloWorld";
    output_log = "c:\test\out.log";
};
sect
{
    name = "GoodbyeAll";
    type = typeB;
    comm1_req = 0xDEADBEEF;
    comm1_resp = (int, 1234366);
};
The file first contains a section with parameters and then some sects. I can parse a file containing just parameters, and I can parse a file containing just sects, but I can't parse both.
from pyparsing import *
from pathlib import Path
command_req = Word(alphanums)
command_resp = "(" + delimitedList(Word(alphanums)) + ")"
kW = Word(alphas+'_', alphanums+'_') | command_req | command_resp
keyName = ~Literal("sect") + Word(alphas+'_', alphanums+'_') + FollowedBy("=")
keyValue = dblQuotedString.setParseAction( removeQuotes ) | OneOrMore(kW,stopOn=LineEnd())
param = dictOf(keyName, Suppress("=")+keyValue+Optional(Suppress(";")))
node = Group(Literal("sect") + Literal("{") + OneOrMore(param) + Literal("};"))
final = OneOrMore(node) | OneOrMore(param)
param.setDebug()
p = Path(__file__).with_name("testp.txt")
with open(p) as f:
    try:
        x = final.parseFile(f, parseAll=True)
        print(x)
        print("...")
        dx = x.asDict()
        print(dx)
    except ParseException as pe:
        print(pe)
The issue I have is that param matches against sect, so it expects an =. So I tried putting ~Literal("sect") in keyName, but that just leads to another error:
Exception raised: Found unwanted token, "sect", found '\n' (at char 188), (line:4, col:56)
Expected end of text, found 's' (at char 190), (line:6, col:1)
How do I get it to use one parse method for sect and another (param) if not sect?
My final goal would be to have the whole lot in a Dict with the global params and sects included.
EDIT
Think I've figured it out:
This line...
final = OneOrMore(node) | OneOrMore(param)
...should be:
final = ZeroOrMore(param) + ZeroOrMore(node)
But I wonder if there is a more structured way (as I'd ultimately like a dict)?
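One more structured arrangement (a sketch, untested; the results names "params" and "sects" are illustrative, not anything pyparsing requires) is to name the two parts so the whole file lands in a single nested dict:
final = Group(ZeroOrMore(param))("params") + Group(ZeroOrMore(node))("sects")
x = final.parseFile(p, parseAll=True)
print(x.asDict())  # e.g. {'params': {...}, 'sects': [...]}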

finding the same words in two files and leaving out not repeated ones in python

I have to write a program that correlates smoking with lung cancer risk. For that I have data in two files.
My code currently pairs the data line by line (e.g. America,23.3 with Spain,77.9, and Italy,24.2 with Russia,60.8).
How can I modify my code so that it pairs the numbers for the same country and leaves out the countries that occur in only one file? (It shouldn't use Germany, France, China, or Korea, because they appear in only one file.)
Thank you so much for your help in advance :)
smoking file:
Country, Percent Cigarette Smokers Data
America,23.3
Italy,24.2
Russia,23.7
France,14.9
England,17.9
Spain,17
Germany,21.7
second file:
Cases Lung Cancer per 100000
Spain,77.9
Russia,60.8
Korea,61.3
America,73.3
China,66.8
Vietnam,64.5
Italy,43.9
and my code:
import math

def readFiles(smoking_datafile, cancer_datafile):
    '''
    Reads the data from the provided file objects smoking_datafile
    and cancer_datafile. Returns a list of the data read from each
    in a tuple of the form (smoking_datafile, cancer_datafile).
    '''
    # init
    smoking_data = []
    cancer_data = []
    empty_str = ''
    # read past file headers
    smoking_datafile.readline()
    cancer_datafile.readline()
    # read data files
    eof = False
    while not eof:
        # read line of data from each file
        s_line = smoking_datafile.readline()
        c_line = cancer_datafile.readline()
        # check if at end-of-file of both files
        if s_line == empty_str and c_line == empty_str:
            eof = True
        # check if end of smoking data file only
        elif s_line == empty_str:
            raise OSError('Unexpected end-of-file for smoking data file')
        # check if at end of cancer data file only
        elif c_line == empty_str:
            raise OSError('Unexpected end-of-file for cancer data file')
        # append line of data to each list
        else:
            smoking_data.append(s_line.strip().split(','))
            cancer_data.append(c_line.strip().split(','))
    # return list of data from each file
    return (smoking_data, cancer_data)

def calculateCorrelation(smoking_data, cancer_data):
    '''
    Calculates and returns the correlation value for the data
    provided in lists smoking_data and cancer_data
    '''
    # init
    sum_smoking_vals = sum_cancer_vals = 0
    sum_smoking_sqrd = sum_cancer_sqrd = 0
    sum_products = 0
    # calculate intermediate correlation values
    num_values = len(smoking_data)
    for k in range(0, num_values):
        sum_smoking_vals += float(smoking_data[k][1])
        sum_cancer_vals += float(cancer_data[k][1])
        sum_smoking_sqrd += float(smoking_data[k][1]) ** 2
        sum_cancer_sqrd += float(cancer_data[k][1]) ** 2
        sum_products += float(smoking_data[k][1]) * float(cancer_data[k][1])
    # calculate and return correlation value
    numer = (num_values * sum_products) - (sum_smoking_vals * sum_cancer_vals)
    denom = math.sqrt(abs(
        ((num_values * sum_smoking_sqrd) - (sum_smoking_vals ** 2)) *
        ((num_values * sum_cancer_sqrd) - (sum_cancer_vals ** 2))
    ))
    return numer / denom
Let's just focus on getting the data into a format that is easy to work with. The code below will get you a dictionary of the form ...
smokers_cancer_data = {
    'America': {
        'smokers': 23.3,
        'cancer': 73.3
    },
    'Italy': {
        'smokers': 24.2,
        'cancer': 43.9
    },
    ...
}
Once you have this you can get any values you need and perform your calculations. See the code below.
def read_data(filename: str) -> dict:
    with open(filename, 'r') as file:
        next(file)  # Skip the header
        data = dict()
        for line in file:
            cleaned_line = line.rstrip()
            # Skip blank lines
            if cleaned_line:
                data_item = cleaned_line.split(',')
                data[data_item[0]] = float(data_item[1])
    return data

# Load data into python dictionaries
smokers_data = read_data('smokersData.txt')
cancer_data = read_data('lungCancerData.txt')

# Build one dictionary that is easy to work with
smokers_cancer_data = dict()
for (key, value) in smokers_data.items():
    if key in cancer_data:
        smokers_cancer_data[key] = {
            'smokers': smokers_data[key],
            'cancer' : cancer_data[key]
        }
print(smokers_cancer_data)
For example, if you want to calculate the sum of the smoker and cancer values:
smokers_total = 0
cancer_total = 0
for (key, value) in smokers_cancer_data.items():
    smokers_total += value['smokers']
    cancer_total += value['cancer']
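Once the merged dictionary exists, the correlation itself can also come from the standard library instead of the hand-rolled formula. A sketch (statistics.correlation needs Python 3.10+ and computes Pearson's r):
from statistics import correlation  # Python 3.10+

smokers_vals = [v['smokers'] for v in smokers_cancer_data.values()]
cancer_vals = [v['cancer'] for v in smokers_cancer_data.values()]
print(correlation(smokers_vals, cancer_vals))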
This will return a list of all the countries that have data in both files, along with their values:
l3 = []
with open('smoking.txt', 'r') as f1, open('cancer.txt', 'r') as f2:
    l1, l2 = f1.readlines(), f2.readlines()
for s1 in l1:
    for s2 in l2:
        if s1.split(',')[0] == s2.split(',')[0]:
            cty = s1.split(',')[0]
            smk = s1.split(',')[1].strip()
            cnr = s2.split(',')[1].strip()
            l3.append(f"{cty}: smoking: {smk}, cancer: {cnr}")
print(l3)
Output:
['America: smoking: 23.3, cancer: 73.3', 'Italy: smoking: 24.2, cancer: 43.9', 'Russia: smoking: 23.7, cancer: 60.8', 'Spain: smoking: 17, cancer: 77.9']
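For larger files the nested loops above do len(l1) * len(l2) comparisons; here is a sketch of the same matching with dict lookups instead (same file names as above):
with open('smoking.txt') as f1, open('cancer.txt') as f2:
    next(f1)  # skip the header line of each file
    next(f2)
    smoking = dict(line.strip().split(',') for line in f1 if line.strip())
    cancer = dict(line.strip().split(',') for line in f2 if line.strip())
# keep only the countries present in both files
common = [f"{cty}: smoking: {smoking[cty]}, cancer: {cancer[cty]}"
          for cty in smoking if cty in cancer]
print(common)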

Optimize code to handle large chunks of data

I have the following code:
import json
data_sample = [{
    "name": "John",
    "age": 30,
    "cars": [
        {
            "temp": {"sum": "20", "for": 12},
            "id": 30,
            "element": [
                {"model": "Taurus1", "doors": {"id": "1", "id2": 101}},
                {"model": "T1", "doors": {"id": "2", "id2": 12}},
                {"model": "As", "doors": {"id": "Mo", "id2": 4}}
            ]
        },
        {
            "temp": {"sum": "10", "for": 12},
            "id": 31,
            "element": [
                {"model": "Taurus2", "doors": {"id": "2", "id2": 102}},
                {"model": "T2", "doors": {"id": "5", "id2": 12}},
                {"model": "Thing", "doors": {"id": "Fo", "id2": 4}}
            ]
        },
        {
            "temp": {"sum": "20", "for": 10},
            "id": 32,
            "element": [
                {"model": "Taurus3", "doors": {"id": "3", "id2": 103}},
                {"model": "T3", "doors": {"id": "15", "id2": 62}},
                {"model": "By", "doors": {"id": "Log", "id2": 4}}
            ]
        }
    ]
}]
def flat_list(z):
    x = []
    for i, data_obj in enumerate(z):
        if type(data_obj) is dict or type(data_obj) is list:
            x.extend([flatten_data(data_obj)])
        else:
            x.extend([data_obj])
    return x

def flatten_data(y):
    out = {}
    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            out[name[:-1]] = flat_list(x)
        else:
            out[name[:-1]] = x
    flatten(y)
    return out

def generatejson(response2):
    # response2 is [(first data set), (second data set)]; convert it to a
    # dictionary {0: (first data set), 1: (second data set)}
    sample_object = {i: data_response for i, data_response in enumerate(response2)}
    flat = {k: flatten_data(v) for k, v in sample_object.items()}
    return json.dumps(flat, sort_keys=True)
print(generatejson(data_sample))
This code takes data in the following format:
[(first data set), (second data set)]
and looks for nested dicts. If a nested dict is detected, the code flattens it into the parent level. For example, the code detects that doors is a nested dict, so it converts {"model": "T1", "doors": {"id": "2", "id2": 12}} into {"model": "T1", "doors_id": "2", "doors_id2": 12}.
Note that it doesn't change the lists/arrays; they are not flattened.
My issue:
On a small amount of data the code works great; however, with a large number of sets (1,000+) performance is very poor, and it sometimes even crashes.
How can I improve and optimize the performance of this code?
The data_sample contains only 1 data set (I assume that's enough for checking).
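One direction worth trying (a sketch, not benchmarked against this exact data): replace the per-key recursion with an explicit stack, which avoids deep Python call stacks on heavily nested input, and use isinstance checks so dict/list subclasses are handled too:
def flatten_data_iter(y):
    out = {}
    stack = [(y, '')]  # (object still to flatten, accumulated key prefix)
    while stack:
        x, name = stack.pop()
        if isinstance(x, dict):
            for k, v in x.items():
                stack.append((v, name + k + '_'))
        elif isinstance(x, list):
            # lists stay lists; only dict/list items inside them are flattened
            out[name[:-1]] = [flatten_data_iter(i) if isinstance(i, (dict, list)) else i
                              for i in x]
        else:
            out[name[:-1]] = x
    return out
If the result is serialized often, it is also worth profiling json.dumps separately, since sort_keys=True adds sorting cost on very large structures.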

splitting a diff file using regex in Python

I'm trying to split a diff (unified format) into its sections using the re module in Python. The format of a diff is like this...
diff --git a/src/core.js b/src/core.js
index 9c8314c..4242903 100644
--- a/src/core.js
+++ b/src/core.js
@@ -801,7 +801,7 @@ jQuery.extend({
return proxy;
},
- // Mutifunctional method to get and set values to a collection
+ // Multifunctional method to get and set values of a collection
// The value/s can optionally be executed if it's a function
access: function( elems, fn, key, value, chainable, emptyGet, pass ) {
var exec,
diff --git a/src/sizzle b/src/sizzle
index fe2f618..feebbd7 160000
--- a/src/sizzle
+++ b/src/sizzle
@@ -1 +1 @@
-Subproject commit fe2f618106bb76857b229113d6d11653707d0b22
+Subproject commit feebbd7e053bff426444c7b348c776c99c7490ee
diff --git a/test/unit/manipulation.js b/test/unit/manipulation.js
index 18e1b8d..ff31c4d 100644
--- a/test/unit/manipulation.js
+++ b/test/unit/manipulation.js
@@ -7,7 +7,7 @@ var bareObj = function(value) { return value; };
var functionReturningObj = function(value) { return (function() { return value; }); };
test("text()", function() {
- expect(4);
+ expect(5);
var expected = "This link has class=\"blog\": Simon Willison's Weblog";
equal( jQuery("#sap").text(), expected, "Check for merged text of more then one element." );
@@ -20,6 +20,10 @@ test("text()", function() {
frag.appendChild( document.createTextNode("foo") );
equal( jQuery( frag ).text(), "foo", "Document Fragment Text node was retreived from .text().");
+
+ var $newLineTest = jQuery("<div>test<br/>testy</div>").appendTo("#moretests");
+ $newLineTest.find("br").replaceWith("\n");
+ equal( $newLineTest.text(), "test\ntesty", "text() does not remove new lines (#11153)" );
});
test("text(undefined)", function() {
diff --git a/version.txt b/version.txt
index 0a182f2..0330b0e 100644
--- a/version.txt
+++ b/version.txt
@@ -1 +1 @@
-1.7.2
\ No newline at end of file
+1.7.3pre
\ No newline at end of file
I've tried the following combinations of patterns but can't quite get it right. This is the closest I have come so far...
re.compile(r'(diff.*?[^\rdiff])', flags=re.S|re.M)
but this yields
['diff ', 'diff ', 'diff ', 'diff ']
How would I match all sections in this diff?
This does it:
r = re.compile(r'^(diff.*?)(?=^diff|\Z)', re.M | re.S)
for m in re.findall(r, s):
    print('====')
    print(m)
You don't need to use regex, just split the file:
diff_file = open('diff.txt', 'r')
diff_str = diff_file.read()
diff_split = ['diff --git%s' % x for x in diff_str.split('diff --git')
              if x.strip()]
print(diff_split)
Why are you using regex? How about just iterating over the lines and starting a new section when a line starts with diff?
list_of_diffs = []
temp_diff = ''
for line in patch:
    # flush the accumulated section when a new "diff" header starts
    if line.startswith('diff') and temp_diff:
        list_of_diffs.append(temp_diff)
        temp_diff = ''
    temp_diff += line
if temp_diff:
    list_of_diffs.append(temp_diff)
Disclaimer: the above is meant to be illustrative only; patch is assumed to be an iterable of lines, e.g. an open file object.
Regex is a hammer but your problem isn't a nail.
Just split on any linefeed that's followed by the word diff:
result = re.split(r"\n(?=diff\b)", subject)
Though for safety's sake, you probably should try to match \r or \r\n as well:
result = re.split(r"(?:\r\n|[\r\n])(?=diff\b)", subject)
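A quick sanity check of that split (illustrative; assumes the diff text shown above is saved as diff.txt):
import re

with open('diff.txt') as f:
    subject = f.read()

sections = re.split(r"(?:\r\n|[\r\n])(?=diff\b)", subject)
for section in sections:
    print(section.splitlines()[0])  # prints each "diff --git ..." header line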
