Testing data from two different sources in Robot Framework - Python

I want to test and validate the data from two different sources at the same time using Robot Framework.
I'm stuck and don't know how to proceed further. Here's the code I've come up with so far:
${row_count}=    Get Element Count    ${basic_info_table_row}
Should Be Equal As Integers    ${row_count}    12
${column_count}=    Get Element Count    ${basic_info_table_column}
Should Be Equal As Integers    ${column_count}    2
${row_list}=    BuiltIn.Create Dictionary
FOR    ${row}    IN RANGE    ${row_count}+1
    ${row_text}=    Get Text    ${basic_info_table_row}
    Log To Console    ${row_text}
END
Right now what's happening is that it only takes the first row and logs that same first row again and again.

You've got a loop with FOR ${row} IN RANGE ${row_count}+1, but you aren't using ${row} anywhere inside the loop. The two values in that block, ${row_text} and ${basic_info_table_row}, therefore stay the same on every iteration.
If you want to use ${row} as an index, use it with ${basic_info_table_row}. Or use ${row} in the xpath or whatever locator ${basic_info_table_row} (or just ${basic_info_table}) resolves to - that locator isn't shown in your question.
See this answer for how you can use an index in your locator. Note the tr[${row}] used in the xpath:
${row_text}= Get Text xpath=/html[1]/body[1]/div[5]/section[1]/div[6]/table[1]/tbody[1]/tr[${row}]/td[6]

Related

Can I amend one data sheet to match another data frame's IDs that are almost similar?

I have multiple data frames to compare. My problem is the product IDs. One is set up like:
000-000-000-000
vs.
000-000-000
(gross)
I have looked on here, Reddit, YouTube, and even went deep down the rabbit hole trying .join, .append, and other methods I've never seen before or don't yet understand. Is there a way (or, even better, some documentation I can read to learn this) to pull the Product ID from the main Excel sheet and compare it to the one(s) that should match? I would then most likely make that the in-place ID across all sheets, so I can use those IDs as the index and do a side-by-side compare of each ID's row data. Each ID has about 113 values to compare - that's 113 columns, but per row, if that makes sense.
Example: (the colorful columns are the main sheet that the non-colored columns will be compared to)
Additional notes:
The highlighted yellow IDs are "unique", and I won't be changing those; instead I'll write them to a list or something and use an if statement to ignore them when found.
Edit:
So I wrote this code, which is almost exactly what I need to do.
It takes out the "-", which I apply to all my IDs. I just need to make a list of the IDs that are unique, to skip over when taking away the extra digits:
dfSS["Product ID"] = dfSS["Product ID"].str.replace("-", "")
Then this will keep only the first 9 digits, except for the unique IDs:
dfSS["Product ID"] = dfSS["Product ID"].str[:9]
I will add the full code below once I get it working 100%.
I am now trying to figure out how to say something like:
lst = [1, 2, 3, 4, 5]
if dfSS["Product ID"] not in lst:
    dfSS["Product ID"] = dfSS["Product ID"].str.replace("-", "").str[:9]
This code does not work, but every day I get closer and closer to being able to compare these similar yet different data frames. The lst is just an example of the 000-000-000 Product IDs in a list that I do not want to filter at all, but keep in the data frame.
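For what it's worth, an if can't test a whole column at once; the element-wise version of this is a boolean mask. A minimal sketch using Series.isin - the frame and the keep list below are made-up illustration values, not the real sheets:

```python
import pandas as pd

# Toy frame; the real sheet has ~113 more columns per row.
dfSS = pd.DataFrame({"Product ID": ["214-865-001-23", "000-000-000", "123-456-789-00"]})

# Hypothetical "unique" IDs that must be left untouched.
keep = ["000-000-000"]

# True for every row whose ID is NOT in the keep list.
mask = ~dfSS["Product ID"].isin(keep)

# Strip dashes and trim to 9 digits on the masked rows only.
dfSS.loc[mask, "Product ID"] = (
    dfSS.loc[mask, "Product ID"].str.replace("-", "", regex=False).str[:9]
)
```

The unique IDs pass through unchanged, so they can still be matched verbatim later.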
If the ID transformation is predictable, then one option is to use a regex to homogenize the IDs. For example, if the situation is just removing the first three digits, then something like the following can be used:
df['short_id'] = df['long_id'].str.extract(r'\d\d\d-([\d-]*)')
If the ID transformation is not so predictable (e.g. due to transcription errors or some other noise in the data) then the best option is to first disambiguate the ID transformation using something like recordlinkage, see the example here.
OK, solved this for every Product ID with or without dashes, #, letters, etc.:
(\d\d\d-)?[_#\d-]?[a-zA-Z]?
(\d\d\d-)? - This is for the first and second three-integer sets, with zero or more matches, including the dash (non-greedy)
[_#\d-]? - This is for any special chars and additional numbers (non-greedy)
[a-zA-Z]? - This one, I'm not sure why, but I had to separate it from the last part because otherwise it wouldn't pick up every letter (non-greedy)
With the above I solved everything I needed with REs.
Where I learned how to improve my RE skills:
RE Documentation
Automate the Boring Stuff- Ch 7
You can test your REs here
An additional way to write this - I put it here to show there is no one way of doing it. REs are super awesome:
(\d{3}-)?[_#\d{3}-]?[a-zA-Z]?

Can I group graphite results by regex?

I've been using graphite for some time now in order to power our backend pythonic program. As part of my usage of it, I need to sum (using sumSeries) different metrics using wildcards.
Thing is, I need to group them according to a pattern; say I have the following range of metric names:
group.*.item.*
I need to sum the values of all items, for a given group (meaning: group.1.item.*, group.2.item.*, etc)
Unfortunately, I do not know in advance the set of existing group values, so what I do right now is query metrics/index.json, parse the list, and generate the desired query (manually creating sumSeries(group.NUMBER.item.*) for every NUMBER I find in the metrics index).
I was wondering if there is a way to have graphite do this for me and save the first roundtrip, as the communication and pre-processing are costly (taking more than half the time of the entire process).
Thanks in advance!
If you want a separate line for each group you could use the groupByNode function.
groupByNode(group.*.item.*, 1, "sumSeries")
Where '1' is the node you're selecting (indexed by 0) and "sumSeries" is the function you are feeding each group into.
You can read more about this here: http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.groupByNode
If you want to restrict the second node to only numeric values you can use a character range. You do this by specifying the range in square brackets [...]. A character range is indicated by 2 characters separated by a dash (-).
group.[0-9].item.*
You can read more about this here:
http://graphite.readthedocs.io/en/latest/render_api.html#paths-and-wildcards
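Putting both pieces together from Python, the whole thing becomes a single render-API call. A small sketch (the host name is a placeholder, and fetching the URL with an HTTP client is left to you):

```python
from urllib.parse import urlencode

# One target: group by the second node (index 1) and sum each group's items.
# The [0-9] character range restricts that node to single-digit group numbers.
target = 'groupByNode(group.[0-9].item.*, 1, "sumSeries")'

params = urlencode({"target": target, "format": "json", "from": "-1h"})

# Hypothetical Graphite host; fetch this with any HTTP client.
url = "http://graphite.example.com/render?" + params
```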

Converting a list of strings to a single pattern

I have a list of strings that follow a specific pattern. Here's an example
['ratelimiter:foobar:201401011157',
'ratelimiter:foobar:201401011158',
'ratelimiter:foobar:201401011159',
'ratelimiter:foobar:201401011200']
I'm trying to end up with a glob pattern that will represent this list, like the following:
'ratelimiter:foobar:201401011*'
I know the first two fields ahead of time. The third field is a timestamp, and I want to find the character position at which the values start to differ from one another.
In the example given, the timestamp ranges from 2014-01-01-11:57 to 2014-01-01-12:00, and the position that differs is the third-from-last character, where 1 changes to 2. If I can find that, then I can slice the string with [:-3] and append '*' (for this example).
Every time I try to tackle this problem I end up with loops everywhere. I just feel like there's a better way of doing this.
Or maybe someone knows a better way of doing this with redis. I'm doing this because I'm trying to get keys from redis, and I don't want to make a request for every key but rather make a batch request using the pattern parameter. Maybe there's a better way, but I haven't found anything yet.
Thanks
Staying with the pattern approach (though converting to timestamps is probably best), here is what I would do to find the longest common prefix:
items = ['ratelimiter:foobar:201401011157',
         'ratelimiter:foobar:201401011158',
         'ratelimiter:foobar:201401011159',
         'ratelimiter:foobar:201401011200']
print(items[0][:[len(set(x)) == 1 for x in zip(*items)].index(False)] + '*')
# ratelimiter:foobar:201401011*
Which reads as: cut the first element of items at the point where the nth characters of all items are no longer equal.
[len(set(x)) == 1 for x in zip(*items)] returns a list of booleans that is True at position i if the characters at position i are equal across all items.
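The same longest-common-prefix idea is also available in the standard library: despite its name, os.path.commonprefix compares plain strings character by character, so it works here too:

```python
import os.path

items = ['ratelimiter:foobar:201401011157',
         'ratelimiter:foobar:201401011158',
         'ratelimiter:foobar:201401011159',
         'ratelimiter:foobar:201401011200']

# commonprefix works character-wise on any strings, not just paths.
pattern = os.path.commonprefix(items) + '*'
print(pattern)  # ratelimiter:foobar:201401011*
```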
This is what I would do:
convert the timestamps to numbers
find the max and min (if your list is not ordered)
take the difference between max and min and convert it back to a pattern
For example, in your case the difference between max and min is 43. And since the min already ends in 157, you can quickly deduce that if the min ends with ***157, the max should be ***200, and you know the pattern.
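A rough sketch of that min/max idea, assuming the varying timestamp is the last ':'-separated field and all timestamps have the same number of digits (any value between the min and the max then shares their common leading digits):

```python
items = ['ratelimiter:foobar:201401011157',
         'ratelimiter:foobar:201401011158',
         'ratelimiter:foobar:201401011159',
         'ratelimiter:foobar:201401011200']

# Fixed prefix and numeric timestamps (assumed to be the last field).
prefix = items[0].rpartition(':')[0]
stamps = [int(s.rpartition(':')[2]) for s in items]

lo, hi = str(min(stamps)), str(max(stamps))

# Keep the leading digits shared by min and max.
common = ''
for a, b in zip(lo, hi):
    if a != b:
        break
    common += a

pattern = prefix + ':' + common + '*'
print(pattern)  # ratelimiter:foobar:201401011*
```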
You almost never want to use the '*' parameter in Redis in production, because it is very slow - much slower than making a request for each key individually in the vast majority of cases. Unless you're requesting so many keys that your bottleneck becomes the sheer amount of data you're transferring over the network (in which case you should really convert things to Lua and run the logic server-side), a pipeline is really what you want.
The reason you want a pipeline is that you're probably paying the cost of transferring data back and forth to your Redis server in separate hops right now. A pipeline, in contrast, queues up a bunch of commands to run against Redis and then executes them all at once, when you're ready. Assuming you're using redis-py (if you're not, you really should be) and r is your connection to your Redis server, you can do this like so:
r = redis.Redis(...)
pipe = r.pipeline()

items = ['ratelimiter:foobar:201401011157',
         'ratelimiter:foobar:201401011158',
         'ratelimiter:foobar:201401011159',
         'ratelimiter:foobar:201401011200']

for item in items:
    pipe.get(item)

# All the values for each item you're getting from Redis will be here.
item_values = pipe.execute()
Note: this will only make one call to Redis and will be much faster than either getting each value individually or running a pattern selection.
All of the other answers so far are good Python answers, but you're dealing with a Redis problem. You need a Redis answer.

How to find and replace 6 digit numbers within HREF links from map of values across site files, ideally using SED/Python

I need to create a BASH script, ideally using SED, to find and replace lists of values in href URL constructs within HTML site files, looking them up in a map (of old to new values) for a given URL construct. There are around 25K site files to look through, and the map has around 6,000 entries that I have to search through.
All old and new values have 6 digits.
The URL construct is:
One value:
HREF=".*jsp\?.*N=[0-9]{1,}.*"
List of values:
HREF=".*\.jsp\?.*N=[0-9]{1,}+N=[0-9]{1,}+N=[0-9]{1,}...*"
The list of values is delimited by the + PLUS symbol, and the list can be 1 to n values in length.
I want to ignore a construct such as this:
HREF=".*\.jsp\?.*N=0.*"
i.e. where the list is only N=0
Effectively I'm only interested in URLs that include one or more values that are in the file map and are not prepended with CHANGED - i.e. the list requires updating.
PLEASE NOTE: in the above construct examples, .* means any character that isn't a digit; I'm just interested in any 6-digit values in the list of values after N=. I'm trying to isolate the N= list from the rest of the URL construct, and it should be noted that this N= list can appear anywhere within the URL construct.
Initially, I want to create a script that will produce a report of all links that fulfil the above criteria and have a 6-digit OLD value that's in the map file, along with the file path, to get an understanding of the links impacted. E.g.:
Filename link
filea.jsp /jsp/search/results.jsp?N=204200+731&Ntx=mode+matchallpartial&Ntk=gensearch&Ntt=
filea.jsp /jsp/search/BROWSE.jsp?Ntx=mode+matchallpartial&N=213890+217867+731&
fileb.jsp /jsp/search/results.jsp?N=0+450+207827+213767&Ntx=mode+matchallpartial&Ntk=gensearch&Ntt=
Lastly, I'd like to find and replace all 6-digit numbers within the URL construct lists, as outlined above, as efficiently as possible (I'd like it to be reasonably fast, as there could be around 25K files, with 6K values to look up, and potentially multiple values in each list).
PLEASE NOTE: there is an additional issue when finding and replacing: an old value could have been assigned a new value that's already in use, which may itself also have to be replaced.
E.G. If the map file is as below:
MAP-FILE.txt
OLD NEW
214865 218494
214866 217854
214867 214868
214868 218633
... ...
and there is a HREF link such as:
/jsp/search/results.jsp?Ntx=mode+matchallpartial&Ntk=gensearch&N=0+450+214867+214868
214867 changes to 214868 - the replacement would need a flag attached to mark that this value has already been changed and should not be replaced again; otherwise what was 214867 would become 218633, because every 214868 gets changed to 218633. Hope this makes sense. I would then run through the file and remove the flag from all 6-digit numbers that had been marked, so that the link would become:
/jsp/search/results.jsp?Ntx=mode+matchallpartial&Ntk=gensearch&N=0+450+214868CHANGED+218633CHANGED
Unless there's a better way to manage these in-file changes.
Could someone please help me with this? I'm not an expert with these kinds of changes, so help would be massively appreciated.
Many thanks in advance,
Alex
I will write an outline for the code in pseudocode, as I don't remember Python well enough to quickly write it out.
First find what type the link is (if it contains only N=0 then type 3, if it contains "+" then type 2, else type 1) and get a list of strings containing "N=..." by splitting (PHP would call it exploding) on the "+" sign.
The first loop is over links. The second loop is over each N= number. The third loop looks in the map file and finds the replacement value. Load the map file's data into a variable before all the loops - file reading is the slowest operation you have in programming.
You replace the value in the third loop, then join (implode, in PHP) the list of new strings into a new link when returning to the first loop.
You probably have several files with links, in which case you need another loop over the files.
When dealing with repeated codes you need a while loop until a spare number is found. And you need to save the numbers that are already used in a list.
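To make the outline concrete, here is a minimal Python sketch for a single link, using the CHANGED tag from the question to protect already-replaced values (the map is the question's sample; looping over the files and the report step are left out):

```python
import re

# Sample map from the question: old 6-digit ID -> new 6-digit ID.
id_map = {
    "214865": "218494",
    "214866": "217854",
    "214867": "214868",
    "214868": "218633",
}

def replace_ids(link):
    # First pass: swap each mapped 6-digit run, tagging every replacement
    # with CHANGED, mirroring the question's flag idea.
    def sub_one(match):
        old = match.group(0)
        return id_map[old] + "CHANGED" if old in id_map else old

    link = re.sub(r"\d{6}", sub_one, link)
    # Second pass: strip the CHANGED tags.
    return link.replace("CHANGED", "")

link = "/jsp/search/results.jsp?Ntx=mode+matchallpartial&Ntk=gensearch&N=0+450+214867+214868"
print(replace_ids(link))
# /jsp/search/results.jsp?Ntx=mode+matchallpartial&Ntk=gensearch&N=0+450+214868+218633
```

With a single re.sub pass the dict lookup already avoids chained replacements (replacements are never rescanned); the tag becomes essential when each map entry is applied in its own pass, as with sed.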

How can I change the size of a table in Word using Python (pywin32)?

ms word table with python
I am working with Python on Word tables. I am generating tables, but all of them are
auto-fit to the window.
Is it possible to change them to auto-fit their contents?
I have tried something like this:
table = location.Tables.Add(location, len(df)+1, len(df.columns))
table.AutoFit(AutoFitBehavior.AutoFitToContents)
but it keeps raising errors.
You want to change your table creation to use this:
# Add two ones after your columns
table = location.Tables.Add(location, len(df)+1, len(df.columns), 1, 1)
Information about why you need those variables can be read here:
http://msdn.microsoft.com/en-us/library/office/ff845710(v=office.15).aspx
But basically, the default behavior is to disable Cell Autofitting and use Table Autofit to Window. The first "1" enables Cell Autofitting. From the link I posted above, the DefaultTableBehavior can be either wdWord8TableBehavior (autofit disabled - the default) or wdWord9TableBehavior (autofit enabled). The number comes from opening up Word's VBA editor and typing into the Immediate Window:
?Word.wdWord9TableBehavior
Next, from the link, we see another option called AutoFitBehavior. This is defined as:
Sets the AutoFit rules for how Word sizes tables. Can be one of the WdAutoFitBehavior constants.
So now we have another term to look up. In the VBA editor's Immediate window again type:
?Word.wdAutoFitBehavior.
After the last dot, the possible options should appear. These will be:
wdAutoFitContent
wdAutoFitFixed
wdAutoFitWindow
AutoFitContent looks to be the option we want, so let's finish up that previous line with:
?Word.wdAutoFitBehavior.wdAutoFitContent
The result will be a "1".
Now you may ask why we have to go through all this trouble finding the numerical representations of the values. From my experience using pywin32 with Excel, you can't get the built-in values from their names most of the time, but putting in the numerical representation works just the same.
Also, one more reason your code may be failing is that the table object may not have an "AutoFit" function.
I'm using Word 2007, and there the Table object has the function AutoFitBehavior.
So change:
table.AutoFit(AutoFitBehavior.AutoFitToContents)
to:
table.AutoFitBehavior(1)
# Which we know the 1 means wdAutoFitBehavior.wdAutoFitContent
Hope I got it right, and this helps you out.
