How to run CRISPR Off-Target-Predictor (CROP)? - python

I would like to check the off-target sites of my gRNA in the genome sequences of the species I want to examine, and I found the method CROP.
However, I do not know how to run this code, because I have never used Python before.
Could anyone walk me through it step by step?
I really appreciate your help.
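For orientation only: the exact way to invoke CROP depends on the release you downloaded, so its README is the authoritative source. As a very rough, hedged illustration of what an off-target search does (this is not CROP's algorithm, just a naive mismatch scan in plain Python; the file name, gRNA sequence, and mismatch threshold are all made up):

# Naive illustration of gRNA off-target scanning - NOT the CROP method,
# only a sketch of the underlying idea (count mismatches at every position).

def read_fasta(path):
    """Return {sequence_name: sequence} from a plain FASTA file."""
    seqs, name, chunks = {}, None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line.upper())
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs

def scan_off_targets(grna, genome, max_mismatches=3):
    """Yield (chromosome, position, mismatch_count) for windows similar to the gRNA."""
    k = len(grna)
    for chrom, seq in genome.items():
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            mismatches = sum(1 for a, b in zip(grna, window) if a != b)
            if mismatches <= max_mismatches:
                yield chrom, i, mismatches

if __name__ == "__main__":
    genome = read_fasta("my_species_genome.fasta")   # hypothetical file name
    for hit in scan_off_targets("GACGTTAGCATCGATCGAAA", genome):   # hypothetical gRNA
        print(hit)

A published tool like CROP also handles PAM context, scoring, and the reverse-complement strand, so treat this only as a way to get comfortable with running a Python script, not as a substitute for the tool itself.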

Related

Can someone please explain librosa.hz_to_svara_h?

I am completely new to the librosa module and also to all this audio editing (I have to learn it for an assignment).
So my question is: can somebody explain librosa.hz_to_svara_h(frequencies, *, Sa, abbr=True, octave=True, unicode=True)? I want to know what each parameter stands for, in layman's terms if possible. But most importantly, I want to know how to get the frequencies from an audio file (.wav format). Is that the same as MFCC or something else?
I read the documentation at https://librosa.org/doc/latest/tutorial.html but couldn't get the entire picture.
Please help.
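Roughly: frequencies are the Hz values to convert, Sa is the frequency of the tonic (Sa) you are measuring against, abbr switches between abbreviated and full svara names, octave adds octave markers, and unicode controls whether those markers use Unicode characters. Getting the frequencies is a separate step and is not the same as MFCCs (those summarize spectral shape rather than pitch); pitch tracking is the usual route. A minimal sketch, assuming librosa's pyin pitch tracker and an arbitrarily chosen Sa of 261.63 Hz (the file name is made up):

import numpy as np
import librosa

# Load the audio (librosa resamples to 22050 Hz by default).
y, sr = librosa.load("my_recording.wav")   # hypothetical file name

# Estimate the fundamental frequency (pitch) for every frame with pYIN.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)

# Keep only voiced frames (pyin returns NaN where no pitch was found).
voiced_f0 = f0[~np.isnan(f0)]

# Map the frequencies to Hindustani svara names, relative to the chosen Sa.
# Sa=261.63 Hz (middle C) is just an assumption for this example.
svaras = librosa.hz_to_svara_h(voiced_f0, Sa=261.63, abbr=True, octave=True, unicode=False)

print(svaras[:20])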

Creating a 6 variable Venn Diagram with Boolean values in 6 columns

I am a novice at Python, so I apologize if this is confusing. I am trying to create a 6-variable Venn diagram. I was trying to use matplotlib-venn; however, building the sets it needs is where I am stuck. My data is thousands of rows long with a unique index, and each column holds boolean values for one category. It looks something like this:
|A|B|C|D|E|F|
|0|0|1|0|1|1|
|1|1|0|0|0|0|
|0|0|0|1|0|0|
Ideally I'd like to make a Venn diagram showing how many people fall into each overlap of categories A, B, C, and so on. How would I go about doing this? If anyone could point me in the right direction, I'd be really grateful.
I found someone with a similar problem, and the solution at the end of that thread is what I'd like to end up with, except with 6 variables: https://community.plotly.com/t/how-to-visualize-3-columns-with-boolean-values/36181/4
Thank you for any help!
Perhaps you might try to be more specific about your needs and what you have tried.
Making a six-set Venn diagram is not trivial at all, even more so if you want to make the areas proportional. I made a program in C++ (nVenn) with a translation to R (nVennR) that can do that. I suppose it might be usable from Python, but I have never tried it, and I do not know if that is what you want. Also, interpreting six-set Venn diagrams is not easy; you may want to check UpSet for a different kind of representation. In the meantime, I can point you to a web page I made that explains how nVenn works (link).
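Whichever plotting route you take, the first hurdle mentioned in the question is turning boolean columns into sets. A minimal sketch with pandas (the column names follow the example table; the DataFrame contents here are made up):

import pandas as pd

# Hypothetical stand-in for the real data: one row per person,
# one boolean column per category.
df = pd.DataFrame(
    {
        "A": [0, 1, 0], "B": [0, 1, 0], "C": [1, 0, 0],
        "D": [0, 0, 1], "E": [1, 0, 0], "F": [1, 0, 0],
    }
)

# One set of row indices per category - this is the input shape that
# matplotlib-venn, nVennR, and similar tools expect.
sets = {col: set(df.index[df[col] == 1]) for col in df.columns}

# Size of every observed combination of categories, which is also the
# kind of count an UpSet plot displays.
combo_counts = df.groupby(list(df.columns)).size()

print(sets["A"] & sets["B"])   # people in both A and B
print(combo_counts)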

Python: Text localiser using Tensorflow

I would like to code a script that could locate a specific word or number in a financial statement. Financial statements contain roughly the same information, but they are not identical or organized in the same way. My thought is that by using Tensorflow I could train a neural network to locate the specific words or numbers for me. I am thinking that if I label different text and numbers in 1000 financial statements and use them to train the neural network, it will then be able to identify these numbers or words in any financial statement. For example, I would tell it, in all 1000 training statements, which number is the profit of the company.
Is this doable? I have been working with coding in python for a couple of months and so far I've built some web scrapers and integrated them with twitter, slack and google sheets. I would be very grateful for all your thoughts on this project and if anyone could steer me in the right direction by sharing relevant tutorials.
Thanks a lot!
It's great that you're getting started. Before thinking about the actual implementation with Tensorflow or any other library, I believe you should first try to understand the problem in terms of its underlying domain.
I'm not entirely sure what you are trying to achieve, but my rough guess is that you want to find out whether a statement is beneficial to the company or not, which sounds like a semantic-analysis type of problem.
So I strongly believe you should first learn the various methodologies related to semantic analysis and find the most appropriate technique.
In short: theory and understanding before the actual code.
Finally, I would suggest asking such theoretical questions on the AI Stack Exchange; here on SO we generally deal with code or questions close to code.
I hope that makes sense? ;)
Drop a comment if you have any doubts.
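To the "Is this doable?" part: the task as posed (label specific words and numbers in many statements, then predict those labels on new ones) is essentially token classification, the same framing used for named-entity recognition. A minimal Keras sketch of that framing, with made-up toy data (vocab_size, max_len, the label set, and all sizes are assumptions, and real statements would first need tokenisation and consistent labels):

import numpy as np
import tensorflow as tf

# Toy dimensions - all hypothetical.
vocab_size = 10_000   # number of distinct word tokens
max_len = 200         # statements padded/truncated to this many tokens
num_labels = 3        # e.g. 0 = other, 1 = profit figure, 2 = company name

# Fake training data: token ids and one label per token.
X = np.random.randint(1, vocab_size, size=(32, max_len))
y = np.random.randint(0, num_labels, size=(32, max_len))

# A small bidirectional LSTM that predicts a label for every token.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(num_labels, activation="softmax")),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=1, batch_size=8)

This is only a sketch of the framing, not a ready solution; whether it works well on real statements depends on how consistently the labels can be applied.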

Implementation of multi-variable, multi-type RNN in Python

I have a dataset which has items with the following layout/schema:
{
    words: "Hi! How are you? My name is Helennastica",
    ratio: 0.32,
    importantNum: 382,
    wordArray: ["dog", "cat", "friend"],
    isItCorrect: false,
    type: 2
}
where I have a lot of different types of data, including:
Arrays (of one type only, e.g. an array of strings or an array of numbers, never both)
Booleans
Numbers with a fixed min/max (i.e. on a scale of 0 to 1)
Unbounded integers (any integer from -∞ to ∞)
Strings, containing a mix of dictionary words and new words
The task is to create an RNN (well, generally, a system that can quickly retrain when given one extra bit of data instead of reprocessing it all - I think an RNN is the best choice; see below for reasoning) which can use all of these factors to categorise any dataset into one of 4 categories - labelled by the type key in the above example, a number 0-3.
I have a large set of examples in the above format (with the answer provided), and a database full of uncategorised examples. My intention is to run the ML model on that set and sort all of them into categories. The reason I need to be able to retrain quickly is the feedback feature: if the AI gets something wrong, any user can report it, in which case that specific JSON is added to the dataset. Obviously, having to retrain on 1000+ JSONs just to add one more would take ages; if I am not mistaken, an RNN can get around this.
I have found many possible use cases for something like this, yet I have spent literal hours browsing GitHub trying to find an implementation, or some Tensorflow module/add-on I could copy or build on, but to no avail.
I assume this would not be too difficult using Tensorflow, and I understand a bit of the maths and logic behind it (I'm not formally educated, so I probably have gaps!), but unfortunately I have essentially no experience with Tensorflow or any other ML framework (beyond copy-pasting code for some other projects). If someone could point me in the right direction in the form of a GitHub repo/Python framework, or even write some demo code to help solve this problem, it would be greatly appreciated. And if you're just going to correct some of my technical knowledge/tell me where I've gone horrendously wrong, I'd appreciate that feedback too (just leave it as a comment).
Thanks in advance!
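No demo code appeared in this thread, but a common way to combine mixed inputs (text plus numeric/boolean features) into a 4-way classifier is the Keras functional API with one input per feature group. A minimal sketch with made-up sizes (vocab_size, max_words, and the feature layout are assumptions, and the words would need to be converted to padded token ids beforehand):

import tensorflow as tf

# Hypothetical sizes.
vocab_size = 5_000   # distinct word tokens across `words` + `wordArray`
max_words = 50       # token ids per example, padded with 0
num_numeric = 3      # e.g. ratio, importantNum, isItCorrect (as 0/1)

# One input per feature group.
text_in = tf.keras.Input(shape=(max_words,), name="tokens")
numeric_in = tf.keras.Input(shape=(num_numeric,), name="numeric")

# Text branch: embed the token ids and average them into one vector.
embedded = tf.keras.layers.Embedding(vocab_size, 32, mask_zero=True)(text_in)
text_vec = tf.keras.layers.GlobalAveragePooling1D()(embedded)

# Merge the text vector with the numeric/boolean features.
merged = tf.keras.layers.Concatenate()([text_vec, numeric_in])
hidden = tf.keras.layers.Dense(64, activation="relu")(merged)
output = tf.keras.layers.Dense(4, activation="softmax", name="type")(hidden)

model = tf.keras.Model(inputs=[text_in, numeric_in], outputs=output)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()

On the retraining point: what lets you fold in one new example cheaply is less the RNN architecture than incremental training, i.e. calling fit again on the already-trained model with only the new or reported examples at a small learning rate, so most architectures can support that workflow.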

Analyse beat intensity

I'm trying to work on a machine learning application, and I need to analyze a .wav file and retrieve the "intensity" of each beat.
You know how, when you look at the waveform of a .wav file, some beats are bigger (taller) than others? How do I analyse that and get those values as numbers with timestamps?
I tried looking into the Aubio library, but I'm not sure it would help me much; it's not that well documented.
I didn't really write any code, just been researching into this, and I'd appreciate if someone could give me a step in the right direction.
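One way to get "how loud is each beat" as numbers with timestamps is to track the beats and read off an energy measure such as RMS at each beat frame. A minimal sketch using librosa rather than Aubio (chosen only because it already came up in this thread; the file name is made up):

import librosa

# Load the track (librosa resamples to 22050 Hz by default).
y, sr = librosa.load("my_track.wav")   # hypothetical file name

# Find beat positions as frame indices and convert them to seconds.
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

# RMS energy per analysis frame - a simple proxy for how "big" the wave is.
rms = librosa.feature.rms(y=y)[0]

# Pair each beat's timestamp with the RMS value at that frame
# (clamping the index in case the last beat falls on the final frame).
for t, frame in zip(beat_times, beat_frames):
    print(f"{t:7.3f} s   intensity {rms[min(frame, len(rms) - 1)]:.4f}")

If "intensity" needs to be closer to perceived loudness than raw amplitude, the same loop works with a different per-frame feature.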
