Regex to capture the codes between specific function

I have been experimenting with Python's re module, trying to capture the code inside specific functions. To illustrate, here is an example of the contents of a PHP file:
public function executeSomething ()
{
$this->title = 'Edit something';
$this->action = 'Edit';
$this->headerTitle = 'Edit something';
return true;
}
public function executeSomethingEdit ()
{
if (strlen ($this->somethingElse) > 0)
{
$this->titleText = "Update";
$title = 'Edit something';
}
else
{
$this->titleText = "Create";
$title = 'Create something';
}
$this->title = $title;
$this->headerTitle = $title;
$this->formTitle = 'Something details';
return true;
}
What the Python script needs to do is iterate through this file, find the functions whose names start with 'public function execute', and get the content within the braces, i.e. { }. I have already come up with Python code to achieve this:
r = re.compile(r"public function execute(.*?)\(\).*?{(.*?)}", re.DOTALL)
The problem occurs when the function contains a conditional, i.e. an if/else statement such as the one in executeSomethingEdit. The script ignores any code below the if statement's closing brace '}'. I therefore need to extend the pattern to include the next function declaration, something like this:
r = re.compile(r"public function execute(.*?)\(\).*?{(.*?)}.*?public function", re.DOTALL)
At the moment this code does not produce the result I want. I need to use Python's re specifically because I need to further analyse the content of {(.*?)}. I'm very new to Python, so I hope someone can point me in the right direction or at least tell me what I'm doing wrong. Thanks in advance.

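If the input PHP has no bugs and has consistent indentation, you could check for a non-space character before the closing brace. The closing brace of an indented inner block is preceded by a space, whereas the closing brace of the function itself (at the start of a line, as in your sample) is preceded by a newline, so [^ ]} skips the inner braces.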
r = re.compile(r'public function execute(.*?)\(\).*?{(.*?)[^ ]}', re.DOTALL)
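A minimal sketch of how that pattern behaves, assuming the PHP source keeps its indentation (four spaces for inner blocks here) and that each function's own braces start at column 0. The php_source sample is a shortened, hypothetical version of the file in the question.

```python
import re

# Hypothetical, shortened sample; inner blocks are indented with spaces.
php_source = """\
public function executeSomething ()
{
    $this->title = 'Edit something';
    return true;
}
public function executeSomethingEdit ()
{
    if (strlen ($this->somethingElse) > 0)
    {
        $title = 'Edit something';
    }
    else
    {
        $title = 'Create something';
    }
    $this->title = $title;
    return true;
}
"""

# Same pattern as above: the body ends at the first '}' that is NOT
# preceded by a space, i.e. the brace that closes the function.
pattern = re.compile(r'public function execute(.*?)\(\).*?{(.*?)[^ ]}', re.DOTALL)

# Map each function name (without the "execute" prefix) to its body.
functions = {name.strip(): body for name, body in pattern.findall(php_source)}

for name, body in functions.items():
    print(name, "->", len(body.splitlines()), "lines captured")
```

This only works while the indentation assumption holds; a brace-counting scanner would be more robust for arbitrary PHP.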

Related

Expanding a Scribunto module that doesn't have a function

I want to get the return value of this Wikimedia Scribunto module in Python. Its source code is roughly like this:
local Languages = {}
Languages = {
["aa"] = {
name = "afarština",
dir = "ltr",
name_attr_gen_pl = "afarských"
},
-- More languages...
["zza"] = {
name = "zazaki",
dir = "ltr"
}
}
return Languages
In the Wiktextract library, there is already Python code to accomplish similar tasks:
def expand_template(sub_domain: str, text: str) -> str:
    import requests
    # https://www.mediawiki.org/wiki/API:Expandtemplates
    params = {
        "action": "expandtemplates",
        "format": "json",
        "text": text,
        "prop": "wikitext",
        "formatversion": "2",
    }
    r = requests.get(f"https://{sub_domain}.wiktionary.org/w/api.php",
                     params=params)
    data = r.json()
    return data["expandtemplates"]["wikitext"]
This works for languages like French because there the Scribunto module has a well-defined function that returns a value, as an example here:
Scribunto module:
p = {}
function p.affiche_langues_python(frame)
-- returns the needed stuff here
end
The associated Python function:
def get_fr_languages():
    # https://fr.wiktionary.org/wiki/Module:langues/analyse
    json_text = expand_template(
        "fr", "{{#invoke:langues/analyse|affiche_langues_python}}"
    )
    json_text = json_text[json_text.index("{") : json_text.index("}") + 1]
    json_text = json_text.replace(",\r\n}", "}")  # remove trailing comma
    data = json.loads(json_text)
    lang_data = {}
    for lang_code, lang_name in data.items():
        lang_data[lang_code] = [lang_name[0].upper() + lang_name[1:]]
    save_json_file(lang_data, "fr")
But in our case we don't have a function to call.
So if we try:
def get_cs_languages():
    # https://cs.wiktionary.org/wiki/Modul:Languages
    json_text = expand_template(
        "cs", "{{#invoke:Languages}}"
    )
    print(json_text)
we get <strong class="error"><span class="scribunto-error" id="mw-scribunto-error-0">Chyba skriptu: Musíte uvést funkci, která se má zavolat.</span></strong> followed by the script's own usage message: usage: get_languages.py [-h] sub_domain lang_code get_languages.py: error: the following arguments are required: sub_domain, lang_code. (The Czech error translates as "Script error: You must specify the function to call." But when you enter a function name as a parameter, as in the French example, it complains that the function does not exist.)
What could be a way to solve this?

The easiest and most general way is to get the return value of the module as JSON and parse it in Python.
Make another module that exports a function dump_as_json that takes the name of the first module as a frame argument and returns the first module as JSON. In Python, expand {{#invoke:json module|dump_as_json|Module:module to dump}} using the expandtemplates API and parse the return value of the module invocation as JSON with json.loads(data["expandtemplates"]["wikitext"]).
Text of Module:json module (call it what you want):
return {
    dump_as_json = function(frame)
        -- Unnamed #invoke arguments are numbered from 1
        local module_name = frame.args[1]
        local json_encode = mw.text.jsonEncode
        -- json_encode = require "Module:JSON".toJSON
        return json_encode(require(module_name))
    end
}
With pywikibot:
import json
from pywikibot import Site
site = Site(code="cs", fam="wiktionary")
languages = json.loads(site.expand_text("{{#invoke:json module|dump_as_json|Module:module to dump}}"))
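Without pywikibot, the same steps could be sketched as below. This assumes the response shape that the question's expand_template() already reads (formatversion 2), and the helper names build_expand_params and parse_module_dump are made up for illustration. The canned sample_response stands in for a real API call so that the parsing step can be tried offline; its entries are copied from the module excerpt in the question.

```python
import json

def build_expand_params(dump_module: str, target_module: str) -> dict:
    """Query parameters for action=expandtemplates, as used in the question."""
    wikitext = "{{#invoke:%s|dump_as_json|%s}}" % (dump_module, target_module)
    return {
        "action": "expandtemplates",
        "format": "json",
        "text": wikitext,
        "prop": "wikitext",
        "formatversion": "2",
    }

def parse_module_dump(api_response: dict):
    """Decode the JSON string that the dump module returned as wikitext."""
    return json.loads(api_response["expandtemplates"]["wikitext"])

# Canned response (an assumption, not real API output).
sample_response = {
    "expandtemplates": {
        "wikitext": '{"aa": {"name": "afarština", "dir": "ltr"}, '
                    '"zza": {"name": "zazaki", "dir": "ltr"}}'
    }
}

params = build_expand_params("json module", "Module:Languages")
languages = parse_module_dump(sample_response)
print(params["text"])
print(languages["zza"]["name"])
```

In real use, params would be passed to requests.get() exactly as in expand_template(), and the decoded r.json() would replace sample_response.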
If you get the error Lua error: Cannot pass circular reference to PHP, this means that at least one of the tables in Module:module to dump is referenced by another table more than once, like if the module was
local t = {}
return { t, t }
To handle these tables, you will have to get a pure-Lua JSON encoder function to replace mw.text.jsonEncode, like the toJSON function from Module:JSON on English Wiktionary.
One warning about this method that is not relevant for the module you are trying to get: string values in the JSON will only be accurate if they were NFC-normalized valid UTF-8 with no special ASCII control codes (U+0000-U+001F excluding tab U+0009 and LF U+000A) when they were returned from Module:module to dump. As on a wiki page, the expandtemplates API will replace ASCII control codes and invalid UTF-8 with the U+FFFD character, and will NFC-normalize everything else. That is, "\1\128e" .. mw.ustring.char(0x0301) would be modified to the equivalent of mw.ustring.char(0xFFFD, 0xFFFD, 0x00E9). This doesn't matter in most cases (like if the table contains readable text), but if it did matter, the JSON-encoding module would have to output JSON escapes for non-NFC character sequences and ASCII control codes and find some way to encode invalid UTF-8.
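To see what that rewriting does to a value, here is a rough local emulation in Python. It is only a sketch of the behaviour described above (invalid UTF-8 and control codes become U+FFFD, the rest is NFC-normalized), not MediaWiki's actual code, and the function name emulate_wiki_sanitizing is made up.

```python
import unicodedata

REPLACEMENT = "\ufffd"

def emulate_wiki_sanitizing(raw: bytes) -> str:
    # Invalid UTF-8 sequences become U+FFFD.
    text = raw.decode("utf-8", errors="replace")
    # ASCII control codes other than tab and LF become U+FFFD.
    text = "".join(
        REPLACEMENT if ord(ch) < 0x20 and ch not in "\t\n" else ch
        for ch in text
    )
    # Everything else is NFC-normalized.
    return unicodedata.normalize("NFC", text)

# Same bytes as the Lua example: "\1\128e" followed by U+0301.
raw = b"\x01\x80e" + "\u0301".encode("utf-8")
result = emulate_wiki_sanitizing(raw)
print([hex(ord(ch)) for ch in result])  # ['0xfffd', '0xfffd', '0xe9']
```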
If, like the module you are dumping, Module:module to dump is a pure table of literal values with no references to other modules or to Scribunto-only global values, you could also get its raw wikitext with the Revisions API and parse it in Lua on your machine and pass it to Python. I think there is a Python extension that allows you to directly use a Lua state in Python.
Running a module with dependencies on the local machine is not possible unless you go to the trouble of setting up the full Scribunto environment on your machine, and figuring out a way to download the module dependencies and make them available to the Lua state. I have sort of done this myself, but it isn't necessary for your use case.

How to Detect URL in text and convert it to link with python? [duplicate]

I am using the function below to match URLs inside a given text and replace them with HTML links. The regular expression is working great, but currently I am only replacing the first match.
How can I replace all the URLs? I guess I should be using the exec command, but I could not figure out how to do it.
function replaceURLWithHTMLLinks(text) {
var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;
return text.replace(exp,"<a href='$1'>$1</a>");
}

First off, rolling your own regexp to parse URLs is a terrible idea. You must imagine this is a common enough problem that someone has written, debugged and tested a library for it, according to the RFCs. URIs are complex - check out the code for URL parsing in Node.js and the Wikipedia page on URI schemes.
There are a ton of edge cases when it comes to parsing URLs: international domain names, actual (.museum) vs. nonexistent (.etc) TLDs, weird punctuation including parentheses, punctuation at the end of the URL, IPV6 hostnames etc.
I've looked at a ton of libraries, and there are a few worth using despite some downsides:
Soapbox's linkify has seen some serious effort put into it, and a major refactor in June 2015 removed the jQuery dependency. It still has issues with IDNs.
AnchorMe is a newcomer that claims to be faster and leaner. Some IDN issues as well.
Autolinker.js lists features very specifically (e.g. "Will properly handle HTML input. The utility will not change the href attribute inside anchor (<a>) tags"). I'll throw some tests at it when a demo becomes available.
Libraries that I've disqualified quickly for this task:
Django's urlize didn't handle certain TLDs properly (here is the official list of valid TLDs). No demo.
autolink-js wouldn't detect "www.google.com" without http://, so it's not quite suitable for autolinking "casual URLs" (without a scheme/protocol) found in plain text.
Ben Alman's linkify hasn't been maintained since 2009.
If you insist on a regular expression, the most comprehensive is the URL regexp from Component, though, judging from its source, it will falsely detect some non-existent two-letter TLDs.

Replacing URLs with links (Answer to the General Problem)
The regular expression in the question misses a lot of edge cases. When detecting URLs, it's always better to use a specialized library that handles international domain names, new TLDs like .museum, parentheses and other punctuation within and at the end of the URL, and many other edge cases. See Jeff Atwood's blog post The Problem With URLs for an explanation of some of the other issues.
The best summary of URL matching libraries is in Dan Dascalescu's Answer
(as of Feb 2014)
"Make a regular expression replace more than one match" (Answer to the specific problem)
Add a "g" to the end of the regular expression to enable global matching:
/ig;
But that only fixes the problem in the question where the regular expression was only replacing the first match. Do not use that code.

I've made some small modifications to Travis's code (just to avoid any unnecessary redeclaration - but it's working great for my needs, so nice job!):
function linkify(inputText) {
var replacedText, replacePattern1, replacePattern2, replacePattern3;
//URLs starting with http://, https://, or ftp://
replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
replacedText = inputText.replace(replacePattern1, '<a href="$1" target="_blank">$1</a>');
//URLs starting with "www." (without // before it, or it'd re-link the ones done above).
replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
replacedText = replacedText.replace(replacePattern2, '$1<a href="http://$2" target="_blank">$2</a>');
//Change email addresses to mailto: links.
replacePattern3 = /(([a-zA-Z0-9\-\_\.])+@[a-zA-Z\_]+?(\.[a-zA-Z]{2,6})+)/gim;
replacedText = replacedText.replace(replacePattern3, '<a href="mailto:$1">$1</a>');
return replacedText;
}

Made some optimizations to Travis' Linkify() code above. I also fixed a bug where email addresses with subdomain type formats would not be matched (i.e. example@domain.co.uk).
In addition, I changed the implementation to prototype the String class so that items can be matched like so:
var text = 'address@example.com';
text.linkify();
'http://stackoverflow.com/'.linkify();
Anyway, here's the script:
if(!String.linkify) {
String.prototype.linkify = function() {
// http://, https://, ftp://
var urlPattern = /\b(?:https?|ftp):\/\/[a-z0-9-+&@#\/%?=~_|!:,.;]*[a-z0-9-+&@#\/%=~_|]/gim;
// www. sans http:// or https://
var pseudoUrlPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
// Email addresses
var emailAddressPattern = /[\w.]+@[a-zA-Z_-]+?(?:\.[a-zA-Z]{2,6})+/gim;
return this
.replace(urlPattern, '<a href="$&">$&</a>')
.replace(pseudoUrlPattern, '$1<a href="http://$2">$2</a>')
.replace(emailAddressPattern, '<a href="mailto:$&">$&</a>');
};
}

Thanks, this was very helpful. I also wanted something that would link things that looked like a URL. As a basic requirement, it'd link something like www.yahoo.com, even if the http:// protocol prefix was not present. So basically, if "www." is present, it'll link it and assume it's http://. I also wanted emails to turn into mailto: links. EXAMPLE: www.yahoo.com would be converted to <a href="http://www.yahoo.com">www.yahoo.com</a>
Here's the code I ended up with (combination of code from this page and other stuff I found online, and other stuff I did on my own):
function Linkify(inputText) {
//URLs starting with http://, https://, or ftp://
var replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
var replacedText = inputText.replace(replacePattern1, '<a href="$1">$1</a>');
//URLs starting with www. (without // before it, or it'd re-link the ones done above)
var replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
var replacedText = replacedText.replace(replacePattern2, '$1<a href="http://$2">$2</a>');
//Change email addresses to mailto: links
var replacePattern3 = /(\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6})/gim;
var replacedText = replacedText.replace(replacePattern3, '<a href="mailto:$1">$1</a>');
return replacedText
}
In the 2nd replace, the (^|[^/]) part is only replacing www.whatever.com if it's not already prefixed by // -- to avoid double-linking if a URL was already linked in the first replace. Also, it's possible that www.whatever.com might be at the beginning of the string, which is the first "or" condition in that part of the regex.
This could be integrated as a jQuery plugin as Jesse P illustrated above -- but I specifically wanted a regular function that wasn't acting on an existing DOM element, because I'm taking text I have and then adding it to the DOM, and I want the text to be "linkified" before I add it, so I pass the text through this function. Works great.

Identifying URLs is tricky because they are often surrounded by punctuation marks and because users frequently do not use the full form of the URL. Many JavaScript functions exist for replacing URLs with hyperlinks, but I was unable to find one that works as well as the urlize filter in the Python-based web framework Django. I therefore ported Django's urlize function to JavaScript:
https://github.com/ljosa/urlize.js
An example:
urlize('Go to SO (stackoverflow.com) and ask. <grin>',
{nofollow: true, autoescape: true})
=> "Go to SO (<a href="http://stackoverflow.com" rel="nofollow">stackoverflow.com</a>) and ask. &lt;grin&gt;"
The nofollow option, if true, causes rel="nofollow" to be inserted. The autoescape option, if true, escapes characters that have special meaning in HTML. See the README file.

I searched on google for anything newer and ran across this one:
$('p').each(function(){
$(this).html( $(this).html().replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '<a href="$1">$1</a> ') );
});
demo: http://jsfiddle.net/kachibito/hEgvc/1/
Works really well for normal links.

I made a change to the emailAddressPattern in Roshambo's String.linkify() so that it recognizes aaa.bbb.@ccc.ddd addresses:
if(!String.linkify) {
String.prototype.linkify = function() {
// http://, https://, ftp://
var urlPattern = /\b(?:https?|ftp):\/\/[a-z0-9-+&@#\/%?=~_|!:,.;]*[a-z0-9-+&@#\/%=~_|]/gim;
// www. sans http:// or https://
var pseudoUrlPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
// Email addresses *** here I've changed the expression ***
var emailAddressPattern = /(([a-zA-Z0-9_\-\.]+)@[a-zA-Z_]+?(?:\.[a-zA-Z]{2,6}))+/gim;
return this
.replace(urlPattern, '<a target="_blank" href="$&">$&</a>')
.replace(pseudoUrlPattern, '$1<a target="_blank" href="http://$2">$2</a>')
.replace(emailAddressPattern, '<a target="_blank" href="mailto:$1">$1</a>');
};
}

/**
* Convert URLs in a string to anchor buttons
* @param {!string} string
* @returns {!string}
*/
function URLify(string){
var urls = string.match(/(((ftp|https?):\/\/)[\-\w@:%_\+.~#?,&\/\/=]+)/g);
if (urls) {
urls.forEach(function (url) {
string = string.replace(url, '<a target="_blank" href="' + url + '">' + url + "</a>");
});
}
return string.replace("(", "<br/>(");
}
simple example

The best script to do this:
http://benalman.com/projects/javascript-linkify-process-lin/

This solution works like many of the others, and in fact uses the same regex as one of them. However, instead of returning an HTML string, it returns a document fragment containing the A element and any applicable text nodes.
function make_link(string) {
var words = string.split(' '),
ret = document.createDocumentFragment();
for (var i = 0, l = words.length; i < l; i++) {
if (words[i].match(/[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi)) {
var elm = document.createElement('a');
elm.href = words[i];
elm.textContent = words[i];
if (ret.childNodes.length > 0) {
ret.lastChild.textContent += ' ';
}
ret.appendChild(elm);
} else {
if (ret.lastChild && ret.lastChild.nodeType === 3) {
ret.lastChild.textContent += ' ' + words[i];
} else {
ret.appendChild(document.createTextNode(' ' + words[i]));
}
}
}
return ret;
}
There are some caveats, namely with older IE and textContent support.
here is a demo.

If you need to show a shorter link (only the domain) that still points to the full URL, you can try my modification of Sam Hasler's code posted above:
function replaceURLWithHTMLLinks(text) {
var exp = /(\b(https?|ftp|file):\/\/([-A-Z0-9+&@#%?=~_|!:,.;]*)([-A-Z0-9+&@#%?\/=~_|!:,.;]*)[-A-Z0-9+&@#\/%=~_|])/ig;
return text.replace(exp, "<a href='$1' target='_blank'>$3</a>");
}

Reg Ex:
/(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*)/ig
function UriphiMe(text) {
var exp = /(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*)/ig;
return text.replace(exp,"<a href='$1'>$1</a>");
}
Below are some tested strings:
Find me on to www.google.com
www
Find me on to www.http://www.com
Follow me on : http://www.nishantwork.wordpress.com
http://www.nishantwork.wordpress.com
Follow me on : http://www.nishantwork.wordpress.com
https://stackoverflow.com/users/430803/nishant
Note: If you don't want a bare www to count as a valid match, use the regex below instead:
/(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig

The warnings about URI complexity should be noted, but the simple answer to your question is:
To replace every match you need to add the /g flag to the end of the RegEx:
/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gi
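Since the question title mentions Python: there, re.sub() replaces every match by default, so no equivalent of the g flag is needed. A minimal sketch, using a pattern modelled on the one in the question; the function name replace_urls_with_links and the sample text are made up, and the URI-complexity warnings apply here just as much.

```python
import re

# Pattern modelled on the question's regex; IGNORECASE plays the role of /i.
URL_PATTERN = re.compile(
    r"(\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|])",
    re.IGNORECASE,
)

def replace_urls_with_links(text: str) -> str:
    # re.sub replaces all non-overlapping matches, not just the first one.
    return URL_PATTERN.sub(r"<a href='\1'>\1</a>", text)

sample = "See http://example.com/a?b=1, then https://example.org."
linked = replace_urls_with_links(sample)
print(linked)
```

To replace only the first match in Python, you would pass count=1 to sub().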

Try the below function :
function anchorify(text){
var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
var text1=text.replace(exp, "<a href='$1'>$1</a>");
var exp2 =/(^|[^\/])(www\.[\S]+(\b|$))/gim;
return text1.replace(exp2, '$1<a target="_blank" href="http://$2">$2</a>');
}
alert(anchorify("Hola amigo! https://www.sharda.ac.in/academics/"));

Keep it simple! Say what you cannot have, rather than what you can have :)
As mentioned above, URLs can be quite complex, especially after the '?', and not all of them start with a 'www.' e.g. maps.bing.com/something?key=!"£$%^*()&lat=65&lon&lon=20
So, rather than have a complex regex that won't meet all edge cases and will be hard to maintain, how about this much simpler one, which works well for me in practise.
Match
http(s):// (anything but a space)+
www. (anything but a space)+
Where 'anything' is [^'"<>\s]
... basically a greedy match, carrying on until you meet a space, quote, angle bracket, or end of line
Also:
Remember to check that it is not already in URL format, e.g. the text contains href="..." or src="..."
Add rel=nofollow (if appropriate)
This solution isn't as "good" as the libraries mentioned above, but is much simpler, and works well in practise.
if (html.match(/(href)|(src)/i)) {
    return html; // text already has a hyperlink in it
}
html = html.replace(
    /\b(https?:\/\/[^\s\(\)\'\"\<\>]+)/ig,
    "<a rel='nofollow' href='$1'>$1</a>"
);
html = html.replace(
    /\s(www\.[^\s\(\)\'\"\<\>]+)/ig,
    "<a rel='nofollow' href='http://$1'>$1</a>"
);
html = html.replace(
    /^(www\.[^\s\(\)\'\"\<\>]+)/ig,
    "<a rel='nofollow' href='http://$1'>$1</a>"
);
return html;
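For readers who came here for Python, the same "say what you cannot have" rules could be sketched as follows. This is an illustration of the approach described above rather than a port of the exact code: the name simple_linkify is made up, the two www. cases are merged into one pattern, and the whitespace before a www. match is kept in the output.

```python
import re

# "Anything" means: not a quote, angle bracket, or whitespace.
ANYTHING = r"""[^'"<>\s]"""
HTTP_URL = re.compile(r"\b(https?://" + ANYTHING + r"+)", re.IGNORECASE)
WWW_URL = re.compile(r"(^|\s)(www\." + ANYTHING + r"+)", re.IGNORECASE)
ALREADY_LINKED = re.compile(r"href|src", re.IGNORECASE)

def simple_linkify(text: str) -> str:
    # Leave text alone if it already looks like it contains markup links.
    if ALREADY_LINKED.search(text):
        return text
    text = HTTP_URL.sub(r"<a rel='nofollow' href='\1'>\1</a>", text)
    # Only link www. at the start or after whitespace, so URLs that were
    # linked in the previous step are not linked a second time.
    text = WWW_URL.sub(r"\1<a rel='nofollow' href='http://\2'>\2</a>", text)
    return text

linked = simple_linkify("Visit www.example.com or https://example.org/x?y=1 now")
print(linked)
```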

Correct URL detection with support for international domains and astral characters is not a trivial thing. The linkify-it library builds its regex from many conditions, and the final size is about 6 kilobytes :) . It's more accurate than all the libs currently referenced in the accepted answer.
See linkify-it demo to check live all edge cases and test your ones.
If you need to linkify HTML source, you should parse it first, and iterate each text token separately.

I've written yet another JavaScript library. It might be better for you since it's very sensitive, with the fewest possible false positives, and is fast and small in size. I'm currently actively maintaining it, so please do test it on the demo page and see how it would work for you.
link: https://github.com/alexcorvi/anchorme.js

I had to do the opposite, and make html links into just the URL, but I modified your regex and it works like a charm, thanks :)
var exp = /<a\s.*href=['"](\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])['"].*>.*<\/a>/ig;
source = source.replace(exp,"$1");

The e-mail detection in Travitron's answer above did not work for me, so I extended/replaced it with the following (C# code).
// Change e-mail addresses to mailto: links.
const RegexOptions o = RegexOptions.Multiline | RegexOptions.IgnoreCase;
const string pat3 = @"([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,6})";
const string rep3 = @"<a href=""mailto:$1@$2.$3"">$1@$2.$3</a>";
text = Regex.Replace(text, pat3, rep3, o);
This allows for e-mail addresses like "firstname.secondname@one.two.three.co.uk".

After input from several sources I've now a solution that works well. It had to do with writing your own replacement code.
Answer.
Fiddle.
function replaceURLWithHTMLLinks(text) {
var re = /(\(.*?)?\b((?:https?|ftp|file):\/\/[-a-z0-9+&@#\/%?=~_()|!:,.;]*[-a-z0-9+&@#\/%=~_()|])/ig;
return text.replace(re, function(match, lParens, url) {
var rParens = '';
lParens = lParens || '';
// Try to strip the same number of right parens from url
// as there are left parens. Here, lParenCounter must be
// a RegExp object. You cannot use a literal
// while (/\(/g.exec(lParens)) { ... }
// because an object is needed to store the lastIndex state.
var lParenCounter = /\(/g;
while (lParenCounter.exec(lParens)) {
var m;
// We want m[1] to be greedy, unless a period precedes the
// right parenthesis. These tests cannot be simplified as
// /(.*)(\.?\).*)/.exec(url)
// because if (.*) is greedy then \.? never gets a chance.
if (m = /(.*)(\.\).*)/.exec(url) ||
/(.*)(\).*)/.exec(url)) {
url = m[1];
rParens = m[2] + rParens;
}
}
return lParens + "<a href='" + url + "'>" + url + "</a>" + rParens;
});
}

Here's my solution:
var content = "Visit https://wwww.google.com or watch this video: https://www.youtube.com/watch?v=0T4DQYgsazo and news at http://www.bbc.com";
content = replaceUrlsWithLinks(content, "http://");
content = replaceUrlsWithLinks(content, "https://");
function replaceUrlsWithLinks(content, protocol) {
var startPos = 0;
var s = 0;
while (s < content.length) {
startPos = content.indexOf(protocol, s);
if (startPos < 0)
return content;
let endPos = content.indexOf(" ", startPos + 1);
if (endPos < 0)
endPos = content.length;
let url = content.substr(startPos, endPos - startPos);
if (url.endsWith(".") || url.endsWith("?") || url.endsWith(",")) {
url = url.substr(0, url.length - 1);
endPos--;
}
if (ROOTNS.utils.stringsHelper.validUrl(url)) {
let link = "<a href='" + url + "'>" + url + "</a>";
content = content.substr(0, startPos) + link + content.substr(endPos);
s = startPos + link.length;
} else {
s = endPos + 1;
}
}
return content;
}
function validUrl(url) {
try {
new URL(url);
return true;
} catch (e) {
return false;
}
}
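A comparable validity check in Python could lean on urllib.parse from the standard library. This is a sketch under assumptions, not an equivalent of the JavaScript URL constructor: the name valid_url is reused only for illustration, and this version is deliberately stricter in that it accepts http and https only and requires a host.

```python
from urllib.parse import urlsplit

def valid_url(candidate: str) -> bool:
    """Return True if candidate parses as an http(s) URL with a host."""
    try:
        parts = urlsplit(candidate)
    except ValueError:
        # urlsplit raises ValueError for malformed input such as a bad IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)

print(valid_url("https://www.youtube.com/watch?v=0T4DQYgsazo"))
print(valid_url("https://"))
```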

Try Below Solution
function replaceLinkClickableLink(url = '') {
let pattern = new RegExp('^(https?:\\/\\/)?'+
'((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.?)+[a-z]{2,}|'+
'((\\d{1,3}\\.){3}\\d{1,3}))'+
'(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*'+
'(\\?[;&a-z\\d%_.~+=-]*)?'+
'(\\#[-a-z\\d_]*)?$','i');
let isUrl = pattern.test(url);
if (isUrl) {
return `<a href="${url}">${url}</a>`;
}
return url;
}

Replace URLs in text with HTML links, ignore the URLs within a href/pre tag.
https://github.com/JimLiu/auto-link

worked for me :
var urlRegex =/(\b((https?|ftp|file):\/\/)?((([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}|((\d{1,3}\.){3}\d{1,3}))(\:\d+)?(\/[-a-z\d%_.~+]*)*(\?[;&a-z\d%_.~+=-]*)?(\#[-a-z\d_]*)?)/ig;
return text.replace(urlRegex, function(url) {
var newUrl = url.indexOf("http") === -1 ? "http://" + url : url;
return '<a href="' + newUrl + '">' + url + '</a>';
});

Checking word inside a string in Choregraphe

Using a Python box, I have to check whether a word I've said to Pepper (saved externally from a dialog box) is in a list (created as a string and saved into ALMemory over SSH from Matlab), and do something depending on whether it is or not.
How can I do this?
def onInput_onStart(self):
    #self.onStopped() #activate the output of the box
    picklist = ALProxy("ALMemory")
    list=picklist.getData("myFood")
def food(self):
    if food in list:
        tts=ALProxy("ALDialog")
        tts.say("Available")

I personally would just manage it on the web using JS; when it comes to this kind of stuff, boxes give more trouble than they're worth. Raise an event with the string you want and check if the word is inside the list. After that you can either use the tts (as you seem to be trying to do) or raise an event (sending the true/false as a parameter) and use it to trigger whatever you want.
Javascript:
session = null
QiSession(connected, disconnected, location.host);
tts = null;
function connected(s) {
console.log("Session connected");
session = s;
startSubscribe();
session.service("ALTextToSpeech").then(function (t) {
tts = t;
});
}
function disconnected(error) {
console.log("Session disconnected");
}
function startSubscribe() {
session.service("ALMemory").then(function (memory) {
memory.subscriber("toTablet").then(function (subscriber) {
subscriber.signal.connect(functionThatChecks)
});
});
}
function functionThatChecks(word)
{
tts.stopAll();
/*Check if exists*/
tts.say("It exists"); //Or raise an event
}
Dialog
u: (word) $eventName="word"
Choregraphe

You need to use self.list before other functions can access it.
You also need to pass users_food to the function when you call food().
This assumes that list is a list of strings and that users_food is a string.
def onInput_onStart(self):
    #self.onStopped() #activate the output of the box
    picklist = ALProxy("ALMemory")
    self.list=picklist.getData("myFood")
def food(self, users_food):
    if users_food in self.list:
        tts=ALProxy("ALDialog")
        tts.say("Available")
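One thing to watch for if the value stored in ALMemory is a single string rather than a real list: in then does a substring test, so "app" would be found in "apple, banana". A small sketch that handles both cases, assuming (this is not stated in the question) that the string is comma-separated; parse_food_list and is_available are hypothetical helper names, and no NAOqi calls are involved.

```python
def parse_food_list(stored):
    """Accept either a real list or a comma-separated string (assumed format)."""
    if isinstance(stored, str):
        return [item.strip().lower() for item in stored.split(",") if item.strip()]
    return [str(item).strip().lower() for item in stored]

def is_available(users_food, stored):
    """Whole-word, case-insensitive membership test."""
    return users_food.strip().lower() in parse_food_list(stored)

stored = "apple, banana, pasta"
print("app" in stored)                # substring test on the raw string
print(is_available("app", stored))    # whole-word test
print(is_available("Banana", stored))
```

Inside the box, the result of is_available() would decide whether to call the speech proxy.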

Get and show json data from website's APIs in Delphi XE

I am trying to re-write a piece of code I wrote in Python to Delphi.
The Python code is:
import json
import urllib  # Python 2; in Python 3, urlopen lives in urllib.request
url = "https://www.bitstamp.net/api/ticker/"
response = urllib.urlopen(url)
data = json.loads(response.read())
lastvalue = data['last']
And this is enough to assign to the variable called "lastvalue" the value that I get from bitstamp's API.
I would like to do the same thing with delphi (I am using delphi XE6). I tried to find some answer here, and I am able to connect to the bitstamp's website and to get the full string, by doing this:
function GetURLAsString(const aurl: string): string;
var
lHTTP: TIdHTTP;
begin
lHTTP := TIdHTTP.Create(nil);
try
lHTTP.IOHandler := TIdSSLIOHandlerSocketOpenSSL.Create(lHTTP);
Result := lHTTP.Get(aURL);
finally
lHTTP.Free;
end;
end;
And then I call this function with this:
procedure TForm2.Button1Click(Sender: TObject);
var
mydata : string;
begin
mydata := GetURLAsString('https://www.bitstamp.net/api/ticker/');
Label1.Text := mydata;
end;
I'm stuck here. I searched a lot but I am not able to figure out how can I assign to Label1.Text just the value assigned to "last".
When I run this I get {"high": "629.40", "last": "622.00", "timestamp": "1401544416", "bid": "621.99", "vwap": "617.47", "volume": "15147.30475739", "low": "602.26", "ask": "622.00"} assigned to Label1.Text.
I hope I was able to explain the question. I am really stuck in this point for some days, I hope someone can help me.

You can use the DBXJSON unit to parse the JSON response.
Try this sample
var
LJsonObj : TJSONObject;
LJsonValue : TJSONValue;
begin
mydata := GetURLAsString('https://www.bitstamp.net/api/ticker/');
LJsonObj := TJSONObject.ParseJSONValue(TEncoding.Default.GetBytes(mydata),0) as TJSONObject;
try
LJsonValue := LJsonObj.Get('last').JsonValue;
Label1.Text:= LJsonValue.Value;
finally
LJsonObj.Free;
end;
end;

With the SuperObject free open source JSON parser, the code would be:
var
Mydata: string;
MyObject: ISuperObject;
Last: string;
begin
... // perform GET request and store response in Mydata
MyObject := SO(Mydata);
Last := MyObject.S['last'];
...
or shorter:
// get the JSON web resource content
Mydata := GetURLAsString('...');
// get the value of the 'last' property of the object
Last := SO(Mydata).S['last'];

Eclipse smart quotes - like in Textmate

Happy Friday. Does anyone know if Eclipse has the notion of smart quotes like TextMate? The way it works is that you select some words and quote them by simply hitting the " key. I'm a newbie here so be gentle. FWIW, I'm using PyDev in Eclipse.
Thanks
Rephrase
What I am looking for is given I have a word or phrase selected on the screen, I would like to simply hit the '"' key and have the entire word or phrase enclosed by quotes. The same would apply to various keys — like ([{"'`.
So say I have the following code
a = {}
a[keyword] = 1
Now (in Python) keyword should be in quotes. I should be able to double-click (select) keyword and simply type the ' and then voilà, the whole word is quoted. Right now what happens is that keyword is replaced by a single quote... Sigh..
Thanks

For Java and XML files you can create a new template in Window / Preferences / Java / Editor / Templates. The template text could look something like this:
"${word_selection}${}"${cursor}
Then, to apply this template, activate code completion with the standard Ctrl-Space (you may have to hit it 2 or 3 times to get to the template selector) and select your quote template.

In the latest PyDev, it should work exactly as you want already (tested in PyDev 2.2.3 -- this was actually around for some time already).

I think I know what you are asking, is it that...
if you press X-key it will select the current word that the cursor is in?
If that is the question, then I don't think so.
There are lots of possible key bindings that are not set in Eclipse. See Window > Preferences > General > Keys
Update:
Sorry, I don't think there is an action to do this in Eclipse.
A plugin may exist that you can attach to a key binding, but I'm not aware of one.

You might check out how one of the comment commands work. For example, if I select say 4 lines of code and I want to line comment all of them I can simply select them then hit ctrl+/ and all of the selected lines of code will be commented.
I'm a long time textmate user and I'm missing it something terrible. I forced myself to make a hard switch away from my mac. I'll investigate as time permits but I can't keep getting stuck on minor tweaks at the moment.
-Matt

Here is one written in Autohotkey:
#NoEnv
SetWorkingDir %A_ScriptDir%
SendMode Input
#InstallKeybdHook
#UseHook On
(::
if GetKeyState("ScrollLock","T")
{
sel := GetSelection(1)
if sel
PasteText("(" sel ")")
else
Send (
sel := ""
}
else
Send (
Return
"::
if GetKeyState("ScrollLock","T")
{
sel := GetSelection(1)
if sel
PasteText("""" sel """")
else
Send "
sel := ""
}
else
Send "
Return
'::
if GetKeyState("ScrollLock","T")
{
sel := GetSelection(1)
if sel
PasteText("'" sel "'")
else
Send '
sel := ""
}
else
Send '
Return
{::
if GetKeyState("ScrollLock","T")
{
sel := GetSelection(1)
if sel
PasteText("{" sel "}")
else
Send {{}}
sel := ""
}
else
SendRaw {
Return
[::
if GetKeyState("ScrollLock","T")
{
sel := GetSelection(1)
if sel
PasteText("[" sel "]")
else
Send [
sel := ""
}
else
Send [
Return
<::
if GetKeyState("ScrollLock","T")
{
sel := GetSelection(1)
if sel
PasteText("<" sel ">")
else
Send <
sel := ""
}
else
Send <
Return
GetSelection(wait = "")
{
ClipBack := ClipboardAll
Clipboard := ""
Send ^c
if wait
ClipWait 0.05
Selection := Clipboard
Clipboard := ClipBack
Return Selection
}
After installing Autohotkey, save this code to a text file, rename the extension to .ahk and run it. It requires the Scroll Lock to be turned on for the code to work.
This code is modified from http://www.autohotkey.net/~Vifon/ to:
Include ' and <
Write ', ", <, {, [, ( instead of '', "", <>, {}, [], () when no text is selected.
