We have lots of data and some charts repesenting one logical item. Charts and data is stored in various files. As a result, most users can easily access and re-use the information in their applications.
However, this not exactly a good way of storing data. Amongst other reasons, charts belong to some data, the charts and data have some meta-information that is not reflected in the file system, there are a lot of files, etc.
Ideally, we want
one big "file" that can store all
information (text, data and charts)
the "file" is human readable,
portable and accessible by
non-technical users
allows typical office applications
like MS Word or MS Excel to extract
text, data and charts easily.
light-weight, easy solution. Quick
and dirty is sufficient. Not many
users.
I am happy to use some scripting language like Python to generate the "file", third-party tools (ideally free as in beer), and everything that you find on a typical Windows-centric office computer.
Some ideas that we currently ponder:
using VB or pywin32 to script MS Word or Excel
creating html and publish it on a RESTful web server
Could you expand on the ideas above? Do you have any other ideas? What should we consider?
I can only agree with Reef on the general concepts he presented:
You will almost certainly prefer the data in a database than in a single large file
You should not worry that the data is not directly manipulated by users because as Reef mentioned, it can only go wrong. And you would be suprised at how ugly it can get
Concerning the usage of MS Office integration tools I disagree with Reef. You can quite easily create an ActiveX Server (in Python if you like) that is accessible from the MS Office suite. As long as you have a solid infrastructure that allows some sort of file share, you could use that shared area to keep your code. I guess the mess Reef was talking about mostly is about keeping users' versions of your extract/import code in sync. If you do not use some sort of shared repository (a simple shared folder) or if your infrastructure fails you often so that the shared folder becomes unavailable you will be in great pain. Note what is also somewhat painful if you do not have the appropriate tools but deal with many users: The ActiveX Server is best registered on each machine.
So.. I just said MS Office integration is very doable. But whether it is the best thing to do is a different matter. I strongly believe you will serve your users better if you build a web-site that handles their data for them. This sort of tool however almost certainly becomes an "ongoing project". Often, even as an "ongoing project", the time saved by your users could still make it worth it. But sometimes, strategically, you want to give your users a poorer experience to control project costs. In that case the ActiveX Server I mentioned could be what you want.
Instead of using one big file, You should use a database. Yes, You can store various types of files like gifs in the database if You like to.
The file would not be human readable or accessible by non-technical users, but this is good.
The database would have a website that Your non-technical users would use to insert, update and get data from. They would be able to display it on the page or export it to csv (or even xls - it's not that hard, I've seen some csv->xls converters). You could look into some open standard document formats, I think it should be quite easy to output data with in it. Do not try to output in "doc" format (but You could try "docx"). You should be able to easily teach the users how to export their data to a CSV and upload it to the site, or they could use the web interface to insert the data if they like to.
If You will allow Your users to mess with the raw data, they will break it (i have tried that, You have no idea how those guys could do that). The only way to prevent it is to make a web form that only allows them to perform certain actions that You exactly know how that they should suppose to perform.
The database + web page solution is the good one. Using VB or pywin32 to script MSOffice will get You in so much trouble I cannot even imagine.
You could use gnuplot or some other graphics library to draw (pretty straightforward to implement, it does all the hard work for You).
I am afraid that the "quick" and dirty solution is tempting, but I only can say one thing: it will not be quick. In a few weeks You will find that hacking around with MSOffice scripting is messy, buggy and unreliable and the non-technical guys will hate it and say that in other companies they used to have a simple web panel that did that. Then You will find that You will not be able to ask about the scripting because everyone uses the web interfaces nowadays, as they are quite easy to implement and maintain.
This is not a small project, it's a medium sized one, You need to remember this while writing it. It will take some time to do it and test it and You will have to add new features as the non-technical guys will start using it. I knew some passionate php teenagers who would be able to write this panel in a week, but as I understand You have some better resources so I hope You will come with a really reliable, modular, extensible solution with good usability and happy users.
Good luck!
Related
I work in Python and have to generate spreadsheets frequently to share my data with programming-naive colleagues. I embed large blocks of text explaining the contents of the spreadsheet and how it was generated into the first page of these spreadsheets routinely. I don't like relying on an associated document to explain definitions, criteria, algorithms, and reliability when I send my results out into the world.
It's really awkward to edit and store the long strings that make up these blocks of text. I'd love to store them in dedicated files that I can work with using a tool INTENDED to edit large blocks of text. I'm wondering how other people deal with this kind of situation. JSON files? YAML? Some obvious built-in functionality in Python I don't know about?
This is obviously a very open-ended question. I'm sure there a lot of different approaches and solutions out there. It's a difficult thing to search for online as there are a lot of obfuscating factors when you search for things like 'python large strings' or 'python text files'. I'm hoping to hear about a number of different approaches.
This is my first post on stack.
I'm looking to gather a large amount of data from a multitude of files on PW so I can quantify a few things about the records.
The directories I'm working with have unique numbers and offer files that are all similar to files in other folders.
Is there a library from python I can use or any other useful tips for taking on this task?
It could potentially save many hours of work if I can do this with code.
A pseudocode example may look like.
for element in dataField:
search(folder)
if folder found:
search(file)
if file found
extract certain data from file X
extractedData.append(data)
Thank you,
R
Based off a quick web search for projectwise api, there is a web-based REST API available, so you'll definitely want to look into that more. You'll need to read the docs carefully to figure out which endpoint does what, but once you know what information you need to send and what kind of data you'll receive, programming a basic Python interface shouldn't be too difficult. One may already exist, I didn't look too hard.
Firstly, apologies for the very basic question. I have looked into other answers but they haven't quite answered what I'm after. I'm confident designing a site in HTML/CSS and have very very basic knowledge of Python.
I want to run a very basic Python script on my website. It analyses tweets about a specific topic, and then posts a sentiment analysis score. I want it to run this sentiment analysis every hour and cache the score.
I have a working Python script which does this in Jupyter Notebook. Could you give me an overview of how I would make this script function online and cache the results? I've read into using Python web frameworks, but from my limited understanding, they seem like overkill?
Thank you for your help!
Could you give me an overview of how I would make this script function online
The key thing would be to uncouple the two parts of your system:
Producing the data
Showing it in a website.
So the first thing to do is have your sentiment-analysis script push its value to a database. The database could be something as simple as a csv file, or it could be a key/value store, or something like MySQL or CouchDB (or hundreds of other choices).
Over on the website you have to make a decision between:
Server-side
Client-side
If the former, you could program in Python if that is what you are most familiar with. Whatever language/framework combination you go for, there will an example tutorial of how to read a value from a database and display it: it is just about the most fundamental thing.
If client-side you will usually be programming in JavaScript. Again you need to choose a framework, but again you should easily be able to find a tutorial to follow.
(Unless you have a good reason to prefer server-side, such as familiarity with an existing framework, or security issues with accessing your database, I'd go with a client-side approach.)
I've read into using Python web frameworks... overkill?
Yes and no. You are going to need some kind of database, and some kind of framework. It would be good to understand the basics of web security, too. If the sentiment analysis is your major goal, all that is going to be a distraction, and it might be better to find a friend who already knows web programming to work with. Or just find a tutorial that is very close to what you want to do, and adapt that.
(P.S. I was going to flag your question as "too broad", but you did ask for an overview, so I hope this helps.)
I am looking for advice on how to automate submitting data directly to a Siebel application at work using Python. Currently I enter the data into an Autohotkey GUI and when a button is selected it enters the data into Siebel for me using mouse moves and mouse clicks to select the right entries for each piece of data. Obviously this is prone to errors and I would like to make the application better if possible. Using an object oriented programming language would improve this greatly. Just to clarify, this is NOT for automation testing. The data and account/page that I am submitting too changes quite often. So, modules like Selenium, Mechanize, and BeautifulSoup won't work for this as far as I can tell. Since not everything has a form or a friendly label that I can submit data to. If anyone has experience with Siebel and knows a way to copy data from and submit data directly to different entries that would be great.
Right now my best option is to use modules like Pyautogui and Pywinauto to perform mouse moves and clicks to copy what my Autohotkey script does. But this seems inefficient and potentially prone to errors. There has to be a better way to accomplish the same thing using Python. I am just not certain how and I would appreciate any advice you guys may have. Even if that is "no there is no other way" it would help me figure out what to do next. Thanks in advance!
Interacting with the Siebel CRM Application can be done in a large number of ways (SOAP, REST, COM, Java, UI, to name a few supported) and the use-case and environment typically define the preferred approach. A decent Siebel developer/consultant will be able to help you make the right choice.
The ease and available tooling for automating the UI is largely dependent on the version of Siebel you run. Prior to OpenUI this was mainly the domain of large Test Automation vendors (HP, Mercury, Oracle) and required a separate license module to be purchased.
Post Open UI the web UI itself became a single DOM object and much more suitable to automation using open source test tooling like Selenium. With the Test Automation license module activate it will also introduce additional HTML attributes that help to create stable locators.
If interacting with the the UI is just a means to change data, then I would advise an alternate approach: directly interact with the business layer. The added advantage is that there is much more information in the data objects than is typically available in a single UI screen and is more structure.
The easiest approach is probably using the web services. The older versions support mainly SOAP but the latest version also support REST. Most programming languages have support for these approaches and will allow you to import their WSDL files. Keep in mind that you're dependent on the DEV team to extend these interfaces too when they add fields to the UI.
Another approach that gives the most flexibility is to directly interact with the business layer using Java (Bean) or COM. The java approach only requires two JAR files and Google has enough examples on this approach to explain how to use it. When Python is your preferred approach, then the COM interface is an interesting approach. This GitHub project has some good examples to get you on your way.
I'm exploring many technologies, but I would like your input on which web framework would make this the easiest/ most possible. I'm currently looking to JSP/JSF/Primefaces, but I'm not sure if that is capable of this app.
Here's a basic description of the app:
Users log in with their username and password (maybe I can somehow incorporate OPENID)?
With a really nice UI, they will be presented a large list of questions specific to a certain category, for example, "Cooking". (I will manually compile this list and make it available.)
When they click on any of these questions, a little input box opens up below it to allow the user to put in a link/URL.
If the link they enter has the same question on that webpage the URL points to, they will be awarded one point. This question then disappears and gets added to a different page that has a list of all correctly linked questions.
On the right side of the screen, there will be a leaderboard with the usernames of the people with the top ten points.
The idea is relatively simple - to be able to compile links to external websites for specific questions by allowing many people to contribute.
I know I can build the UI easily with Primefaces. [B]What I'm not sure is if JSP/JSF gives the ability to parse HTML at a certain URL to see if it contains words.[/B] I can do this with python easily by using urllib, but I can't use python for web GUI building (it is very difficult). What is the best approach?
Any help would be appreciated!!! Thanks!
The best approach is whatever is best for you. If Python isn't your strength but Java is, then use Java. If you're a Python expert and know little Java, use Python.
There are so many resources on the Internet supporting so many platforms that the decision really comes down to what works best for you.
For starters, forget about JSP/JSF. This is an old combination that had many problems. Please consider Facelets/JSF. Facelets is the default templating language in the current version of JSF, while JSP is there only for backwards compatibility.
What I'm not sure is if JSP/JSF gives the ability to parse HTML at a certain URL to see if it contains words.
Yes it does, although the actual fetching of data and parsing of its content will be done by plain Java code. This itself has nothing to do with the JSF APIs.
With JSF you create a Facelet containing your UI (input fields, buttons, etc). Then still using JSF you bind this to a so-called backing bean, which is primarily a normal Java class with only one or two JSF specific annotations applied to it (e.g. #ManagedBean).
When the user enters the URL and presses some button, JSF takes care of calling some action method in your Java class (backing bean). In this action method you now have access to the URL the user entered, and from here on plain Java coding starts and JSF specifics end. You can put the code that fetches the URL and does the parsing you require in a separate helper class (separation of concerns), or at your discretion directly in the backing bean. The choice is yours.
Incidentally we had a very junior programmer at our office use JSF for something not unlike what you are requesting here and he succeeded in doing it in a short time. It thus really isn't that hard ;)
No web technology does what you want. Parsing documents found at certain urls is out of the scope of building web interfaces.
However, each of Java's web technologies will give you, without limits, access to a rich and varied (if not too rich and much too varied) set of libraries and frameworks running on JVM. You could safely say that if there is a library for doing something, there will be a Java version available. Downloading and parsing a document will not require more than what is available in the standard library (unless you insist on injecting your dependencies or crosscutting your concerns), so no problems with doing your project with JSP, or JSF/Primefaces, or whatever.
Since you claim to already know Python, and since you will have to add some HTML/CSS anyway, I suggest you try Django. It's dead simple, has a set of OpenID plugins to choose from, will give you admin interface for free (so you can prime the pump with the first set of links).