Calculating number of T-units in a given sentence using Python - python

I've been working on a second language development project. I need to count the T-units in a given sentence using Python. For example, for the following sentences:
The man did not like water.
1 t-unit (The man did not like water)
The man did not like water although he lived by the sea.
1 t-unit (The man did not like water although he lived by the sea)
The man never liked water and he certainly did not like living in the swamp with her grandparents.
1 t-unit (The man never liked water)
1 t-unit (he certainly did not like living in the swamp with her grandparents)
The man did not like water or juice.
1 t-unit (The man did not like water or juice)
I've checked out NLTK, spaCy and Stanford NLP (Stanza), but none of them includes T-unit detection.
I've come across this but it is about clause extraction.
Any idea how I can detect such t-units using Python?
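One possible starting point, offered as a rough sketch rather than an established method: run a dependency parser and count the root clause plus every clause that is coordinated with it and brings its own subject. Subordinate clauses ("although he lived by the sea") and coordinated nouns ("water or juice") then stay inside the same T-unit. Everything named below (Token, SUBJECT_DEPS, count_t_units) is hypothetical, and the parses are hand-written for shortened versions of the example sentences, using the label names spaCy's English models emit ("ROOT", "nsubj", "conj"). With a real parser the tuples could be built from attributes such as token.dep_ and token.head.i.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    text: str
    dep: str   # dependency label, e.g. "ROOT", "nsubj", "conj"
    head: int  # index of the head token; the root points to itself


SUBJECT_DEPS = {"nsubj", "nsubjpass", "csubj"}


def count_t_units(tokens: List[Token]) -> int:
    """Heuristic: the root clause plus each coordinated clause with its own subject."""
    # Indices of tokens that govern a subject, i.e. heads of full clauses.
    has_subject = {t.head for t in tokens if t.dep in SUBJECT_DEPS}
    main_verbs = {i for i, t in enumerate(tokens) if t.dep == "ROOT"}
    grew = True
    while grew:  # repeat so chains of coordinated clauses are followed
        grew = False
        for i, t in enumerate(tokens):
            if (t.dep == "conj" and t.head in main_verbs
                    and i in has_subject and i not in main_verbs):
                main_verbs.add(i)
                grew = True
    return len(main_verbs)


# Hand-written parses (assumptions, not parser output).
coordinated = [  # "The man liked water and he hated swamps"
    Token("The", "det", 1), Token("man", "nsubj", 2), Token("liked", "ROOT", 2),
    Token("water", "dobj", 2), Token("and", "cc", 2), Token("he", "nsubj", 6),
    Token("hated", "conj", 2), Token("swamps", "dobj", 6),
]
subordinate = [  # "He disliked water although he lived there"
    Token("He", "nsubj", 1), Token("disliked", "ROOT", 1), Token("water", "dobj", 1),
    Token("although", "mark", 5), Token("he", "nsubj", 5), Token("lived", "advcl", 1),
    Token("there", "advmod", 5),
]
noun_coord = [  # "He liked water or juice"
    Token("He", "nsubj", 1), Token("liked", "ROOT", 1), Token("water", "dobj", 1),
    Token("or", "cc", 2), Token("juice", "conj", 2),
]

print(count_t_units(coordinated), count_t_units(subordinate), count_t_units(noun_coord))
```

The heuristic depends entirely on the parser's attachment decisions, so it would miscount sentences where the second clause drops its subject or the parser labels the coordination differently. It is best checked against hand-annotated sentences before being used for research counts.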

Related

Scraping Yelp review content displaying different tags using Beautiful Soup

I'm practicing web-scraping and trying to grab the reviews from the following page: https://www.yelp.com/biz/jajaja-plantas-mexicana-new-york-2?osq=Vegetarian+Food
This is what I have so far after inspecting the name element on the webpage:
page = requests.get('https://www.yelp.com/biz/jajaja-plantas-mexicana-new-york-2?osq=Vegetarian+Food', headers={'User-Agent':'Mozilla/5.0'}).text
parsed_page = BeautifulSoup(page, 'lxml')
# print(parsed_page)
for x in parsed_page.find_all('a', class_='css-1422juy'):
    print(x)
But it doesn't seem to be working. The output is not the name but:
<a class="css-1422juy" href="/c/mexican">Mexican</a>
<a class="css-1422juy" href="/c/vegan">Vegan</a>
<a class="css-1422juy" href="/c/bars">Bars</a>
<a class="css-1422juy" href="https://www.yelp.com/menu/jajaja-plantas-mexicana-new-york-2" role="link">View full menu<span class="display--inline__09f24__c6N_k margin-l1__09f24__m8GL9 border-color--default__09f24__NPAKY"><span aria-hidden="true" class="icon--14-chevron-right-outline css-mpwjkl"><svg class="icon_svg" height="14" width="14"><path d="M5.043 11.5a.498.498 0 00.353-.146L9.75 7 5.396 2.646a.5.5 0 00-.707.708L8.336 7l-3.647 3.646a.502.502 0 00.354.854z"></path></svg></span></span></a>
<a class="css-1422juy" href="/questions/XipQLDbyTl5tsLlyzAWzug" role="link">Ask a question<span class="display--inline__09f24__c6N_k margin-l1__09f24__m8GL9 border-color--default__09f24__NPAKY"><span aria-hidden="true" class="icon--24-add-v2 css-106vfgv"><svg class="icon_svg" height="24" width="24"><path d="M19 11h-6V5a1 1 0 10-2 0v6H5a1 1 0 100 2h6v6a1 1 0 102 0v-6h6a1 1 0 100-2z"></path></svg></span></span></a>
If I use html.parser and .prettify() instead, the parsed output of the name, rating, and review fields in the console looks pretty different:
<script type="application/ld+json">
{"#context":"https://schema.org","#type":"Restaurant","name":"Jajaja Plantas Mexicana","image":"https://s3-media0.fl.yelpcdn.com/bphoto/OkWKXxOZBLJO7hRjOlIMig/l.jpg","priceRange":"$11-30","telephone":"(646) 883-5453","address":{"streetAddress":"162 E Broadway","addressLocality":"New York","addressCountry":"US","addressRegion":"NY","postalCode":"10002"},"review":[{"author":"Caroline J.","datePublished":"2021-11-24","reviewRating":{"ratingValue":5},"description":"I typically don&apos;t write reviews for restaurants, but I&apos;ll make an exception! I have been vegan for almost 4 years now and meatless for about 8 years now. My sister told me about this place and we didn&apos;t get a chance to go until last night. My sister is non-vegan, my mom is also non-vegan. I went with my sister, and we got food to share. My sister who is a very skeptical about vegan food absolutely loved it. My mom who is also skeptical about vegan food had my sisters leftovers and absolutely loved it as well! Everything was delicious and I love the atmosphere in the restaurant as well! Definitely will be going back because they have other stuff on the menu I want to try! It&apos;s both vegan and non vegan approved for a 100% vegan Mexican restaurant!!"},{"author":"Melissa W.","datePublished":"2021-12-24","reviewRating":{"ratingValue":5},"description":"The food was great. I went with my partner for a birthday dinner and while I have adopted a plant based lifestyle, he has not but he enjoyed EVERYTHING that we ordered."},{"author":"Sooji L.","datePublished":"2022-01-09","reviewRating":{"ratingValue":5},"description":"NACHOS NACHOS NACHOS ($18 with guac). I prefer having these nachos than other nachos that aren&apos;t vegan. Soooooooo gooooood. I have to say that to pay extra $3 for guac, they can add more than just a scoop. The portions are enough for 3 people! \n\nFish tacos were great. Replaced with squash and flavorful all together. $9 for two tacos. 
You may add an extra taco for an extra $3/4 bucks. \n\nMy husband ordered the Gorditas. The shell was so crunchy and good. It was supposed to have "bacon" but it was quite underwhelming. Wouldn&apos;t quite recommend this dish.\n\nBoth drinks were delish! Almond horchata... I would get that the next time I come back. It is a bit gritty though.\nI ordered the matcha y coconut cocktail. Cocktails are $$. It was $15. Some drinks can be turned into a mock tail for the same price which I find that to be unreasonable. Regardless, the drinks are worth getting.\n\n****ASK FOR HOT SAUCE! and you&apos;ll get three choices. I love my hot sauce and the variety they provided. The mild (orange) was my favorite. \n\n(Rating was not based on this following experience) on our visit, there was a homeless man in their outdoor seating area which we were able to view from our seat inside. At some moments, he pulled down his pants. Not pleasant, I&apos;d say. The staff did try to remove him but he did not budge. I hope they reached out to homeless services. He was there during our entire dining experience. \n\nMy rating is purely on the food and service. That situation was out of their control but do please keep in mind that the neighborhood is not the best area. They do have multiple locations so please consider checking them out."},{"author":"Roshni P.","datePublished":"2021-11-11","reviewRating":{"ratingValue":5},"description":"I was skeptical because I am from Southern California and Mexican has to hit right for me. But holy crap this is amazing!!!! \n\nBirria tacos was our least favorite but that&apos;s not saying much because I would give it 9.5/10 \n\nBurrito was 10/10 . As you eat it there is more and more flavor and so so good! I like the red sauce side more \n\nNachos is hands down the best item!!! Plenty of food for your meal. I would give it a 11/10. The chorizo is soo good. Ask for hot sauces on the side. 
The orange one is really good"},{"author":"Whitney L.","datePublished":"2021-11-03","reviewRating":{"ratingValue":5},"description":"This place hits the spot on so many levels.\n\n Service was attentive, efficient and personable. Check!\n\nFood---outstanding. I ordered the Tavon Taco bowl, pumpkin and beet empanada and the Matcha and Coconut Cocktail. Everything was delicious and seasoned well. If you like matcha, don&apos;t sleep on that Matcha +Coconut Cocktail....OUTSTANDING. Everything tasted truly unique---like something you can&apos;t get other places. And even better, lots of veggies and healthy options. Double- check!!\n\nTheir homemade chorizo is yum yum yum. Make sure to try that!\n\nDecor and ambiance was fun, colorful and on brand. This place has great energy. Highly recommend and can&apos;t wait to go back."},{"author":"Johnny G.","datePublished":"2021-12-27","reviewRating":{"ratingValue":4},"description":"The nachos are to die for. Unbelievable, so clean and not heavy. Couldn&apos;t tell the difference from any carnivorous nacho I&apos;ve had. I loved it and wished I stopped there.\nI had the Coconut Queso Quesadilla and wasn&apos;t really a fan. Tasted like a pasta dish (heavy on the pesto) and it had a sweetness. Nonetheless great food overall."},{"author":"Kathy X.","datePublished":"2021-09-25","reviewRating":{"ratingValue":4},"description":"Recently, I came here with a friend for lunch. I was looking forward to trying plant-based food and drink. The interior was lovely, bright, and modern. There is a fully stocked bar with plants all around, orange lights dropping down from the ceiling, and nice wood tables. \n\nFor my meal, I ordered Almond Horchata (always have to get the Horchata if it&apos;s on the menu), Crispy Chayote Fish, and Enchiladas Mole. I also got to try my friend&apos;s Chorizo Nachos and Mexican Street Corn. \n\nThe Almond Horchata came in a wine glass with a beautiful deep purple orchid flower. 
The color of the drink was pale brown with bits of coconut on top. It was delightfully creamy, cinnamon-y, and sweet. \n\nChayote Fish consisted of hemp and flax seed buttered squash, and the taco also came with chipotle almond butter and red onion. The squash was fried to perfection in a deliciously well-seasoned batter, and truly resembled the taste of fish. The chipotle-flavored butter added a depth of flavor to the tacos, while the red onion gave it an additional crunch and fresh element. \n\nEnchiladas Moles had shredded palm carnitas, coconut queso, guajillo, sour cream, and also came with Spanish rice. This was absolutely wonderful! The palm carnitas was tender, succulent, with plenty of flavorful spices. The sauce that accompanied the enchiladas was yummy as well. In addition, the Spanish rice was cooked well. \n\nNachos came with plant based chorizo, fermented black beans, corn, turmeric, queso fundido, and sour cream. What a mouthwatering medley of flavors that mingled together to create very tasty bites. The chorizo was juicy, succulent, and packed with bold flavor. The cheese sauce with the sour cream was addicting and was well distributed throughout all the chips. \n\nThe Mexican Street Corn was covered with a powdery type of cheese, and slightly charred (which is how I like it) and was good. \n\nService was top notch. Our waiter was friendly and kind. Everything was made with such care and attention to detail. I also loved the unique menu offerings. I would definitely like to return and try the Coconut Queso Quesadilla, and perhaps the Peanut Chocolate Cake."},{"author":"Alissa M.","datePublished":"2022-01-09","reviewRating":{"ratingValue":4},"description":"Came here for lunch with my wife on a Saturday afternoon. The restaurant is absolutely adorable inside! I love the plants and the decor all throughout. The nachos are to die for! Definitely the best nachos I have had since becoming dairy free. The coquito was also very good! 
The display of the chorizo burrito and barbacoa tacos was cute but I wasn&apos;t a huge fan of the texture of the barbacoa or the flavor of the protein inside the burrito. We will definitely be back for the Nachos and I must try the mini churros next time!"},{"author":"Steven Z.","datePublished":"2021-08-12","reviewRating":{"ratingValue":5},"description":"Great Mexican food with a cool interior playing Hispanic music which made for a nice ambience. Prices are fair\n\nMy friend and I were starving after a long day and ordered a lot of food: nachos, chorizo burrito, fish tacos and a quesadilla. \n\nThe nachos (with guac) were really good and made for a great appetizer, the chorizo burrito was delicious (but very messy since it&apos;s covered in sauce), the quesadilla was really good too. But the highlight of the meal was the fish tacos - nice crispy outside but soft and yummy inside. If I had to get one dish again, it&apos;d be the fish taco. \n\nThe frozen agave margarita was a refreshing as well; it&apos;s pretty strong so you&apos;ll definitely taste the alcohol."},{"author":"Lily C.","datePublished":"2021-10-30","reviewRating":{"ratingValue":5},"description":"We got the fish tacos, carnitas, nachos with guacamole, and churros! Everything was so good and we couldn&apos;t believe it was all vegan! My boyfriend and I are both meat eaters, but we noticed that we didn&apos;t feel sluggish/sleepy (as we would with our non-vegan meals) after finishing our food. Definitely get the churros if you&apos;re craving for something sweet after your meal! \n\n5/5 Highly recommend whether you&apos;re a vegan or not! :D"},{"author":"Rebecca R.","datePublished":"2021-12-26","reviewRating":{"ratingValue":5},"description":"Yummy food. Inventive and interesting. Wide variety. The nachos were great. Good sized portions. Accommodating for preferences and such. 
Even the kiddos loved it!"},{"author":"Shubhi M.","datePublished":"2021-07-18","reviewRating":{"ratingValue":5},"description":"I&apos;ve never tried a vegan mexican food that tasted this good. It&apos;s crazy some of the vegan taco options they had (birria, carnitas, chorizo, barbacoa, and fish), I have never had an opportunity to try vegan version of those dishes. I&apos;ve never had barbacoa or fish but I can imagine this is may somewhat taste like. It was almost a weird (but definitely good) experience how meaty the texture was of the tacos. \n\nThe nachos were beyond amazing, a pet peeve of mine is when I get served a bunch of chips with toppings only on the top and the bottom of the pile is just plain chips but these were perfectly layered with so much vegan chorizo. They recently opened a location in a whole food which is very close to me so i&apos;m definitely going to go back!"},{"author":"Likitha M.","datePublished":"2021-08-11","reviewRating":{"ratingValue":4},"description":"My friend picked out this place for us to try and I didn&apos;t realize that it was all plant-based until our waitress mentioned it. We didn&apos;t have a reservation but we were seated as soon as we walked in and given a QR code to scan for the menus. \n\nIt felt like there were so many options to choose from but we ultimately narrowed it down and ordered the nachos with added guac, chayote "fish" tacos, chorizo tacos, barbacoa tacos, and almond horchata. We LOVED the nachos -- the chips were nice and crispy and all the toppings were so cohesive. There was a really good topping/chip ratio so we never ran out of one or the other. In addition, the sauces added were delicious -- if you come by and only want one thing to order, get the nachos!! The tacos were all so flavorful and delicious as well, but so small and expensive. 
I personally don&apos;t think they were worth $8 for 2 tacos, especially if it only took like 3 bites to consume but I understand the pricing as it is vegan and located in NYC. Finally, the horchata was very refreshing and I loved it! \n\nOverall, they had really delicious food and it was a pretty decent experience for my first time having an all plant-based meal!"},{"author":"Fionna L.","datePublished":"2021-07-14","reviewRating":{"ratingValue":4},"description":"Link to menu: https://qrcgcustomers.s3-eu-west-1.amazonaws.com/account8682491/15091203_1.pdf?0.4352589218767642 \n\n(4.5/5) The lively, communal atmosphere made eating here a pleasure! We sat at the bar so service was slower but never neglectful. Their nachos and tacos are must-trys and great to share with friends!\n\n - Nachos: $13 + $3 guac: the BEST plant-based nachos I&apos;ve had! Their chorizo was chewy and flavorful, the fermented black beans were yum and very filling, and the guac was definitely worth it for added depth and creaminess. Small thing, but I wish they had more than just a couple red and purple nachos.\n - Chorizo Burrito: $13: too salty! We could barely taste the ingredients or differentiate the sauces because salt dominated the flavor of every bite. Quite disappointed in this one. \n - Crispy Chayote Fish Tacos (2): $8: how ingenious to use "hemp & flax seed battered squash" as a substitute for fish! The description sounds sophisticated but it tasted very much like BBQ Lay&apos;s, even down to the delicious crunch. Due to the strength of its flavorings, we couldn&apos;t taste any of the squash&apos;s original flavor. The ring of red onion was aesthetic but would have been easier to eat if chopped.\n\nSide note: their restrooms downstairs demonstrate how they turn rustic elements unique through modern and innovative design. \n\nAll in all a great experience, though I wouldn&apos;t recommend the burrito. 
Would love to go back to try the Coconut Queso Quesadilla and Spicy Birria Tacos!!"},{"author":"Sharon J.","datePublished":"2021-10-16","reviewRating":{"ratingValue":4},"description":"Kinda overrated food, but great vibes! Pretty good for a vegan restaurant, but I&apos;ve had better vegan food and Mexican food. \n\nI got the chorizo burrito but it was very salty with not a lot of other flavor besides that - almost as if they were using salt to cover up the lack of flavor. Definitely very pretty food though, nice interior, and great service - just wish the food was more flavorful. Maybe I&apos;ll try the tacos next time and see if those are better..."},{"author":"Srini V.","datePublished":"2021-09-24","reviewRating":{"ratingValue":5},"description":"I have been meaning to visit this location of Jajaja for some time now. And the opportunity presented itself as a few of us were looking for a neighborhood restaurant in the middle of a torrential downpour.\n\nWe chose to take a table inside. The QR code for their menu was pasted on the wall by our table. We had the guacamole y chips to start. I then got the sopa del día, which happened to be a green soup (sopa verde). This was the highlight of my meal. The soup was thick, had a generous sprinkling of hot sauce and was just perfect, especially after the rain. I then had the chipotle sweet potato street tacos that were delicious ... the fermented beans made for a hearty filling.\n\nGreat service. Nice ambiance. Everyone was happy with their food and drinks. I will be back soon!"},{"author":"Julianna Y.","datePublished":"2021-09-29","reviewRating":{"ratingValue":5},"description":"This place is DELICIOUS! I&apos;ve been ordering my lunch here for a while now, and I have to say that everything I&apos;ve had has been so good! This is a vegan Mexican heaven for foodies! Things you should order when you come here are: \n\n-Turmeric Cauliflower Rice \n-Taco Tazón bowl \n-Nachos! 
\n-chorizo Burrito \n\nAside from these yummy dishes, their hot sauces are amazing! Definitely recommend getting the mild (orange) and black bean sauce!"},{"author":"Saasha G.","datePublished":"2021-09-14","reviewRating":{"ratingValue":4},"description":"I have became a certified vegan food lover! Never have I tasted such great food and no meat in anything. I was astonished at the flavors. This was my first time dining here and I will return. The place is small and minimal decor but honestly I don&apos;t care just give me the food!"},{"author":"Edward K.","datePublished":"2021-11-20","reviewRating":{"ratingValue":5},"description":"My friend and I get dinner once a month, we alternate choosing and this month my friend selected Jajaja. We arrived a half hour before our 6:45 reservation on a Thursday. The gentleman at the door was able to get us seated right away.\n\nThe restaurant has a trendy vibe, cool furniture and lighting. We did get wedged into a slightly uncomfortable corner table but it may have been uncomfortable mainly because I&apos;m on the taller side.\n\nThe goal of our monthly outings is to continually challenge our palates. We&apos;ve had the vegan/vegetarian genre before but we hadn&apos;t tried vegan Mexican before. The menu is cool, lots of choices that are able to be shared. We selected 5 starters, 2 entree style dishes, and three tacos. The heart of palm ceviche was very fresh. The beet and pumpkin empanada was piping hot and flavorful. I enjoyed the tamale with jackfruit. The Gordita and Burrito entrees were pretty substantial and also tasty. We found the tacos to be a highlight, the cauliflower especially. The birria and "fish" tacos were also quite nice, the birria was perhaps the dish with the most kick and the "fish" had a good texture.\n\nService was solid throughout the meal. Although we ate a solid amount of dishes for two people, I didn&apos;t feel overstuffed. 
Definitely a place worth returning to."},{"author":"Pamela L.","datePublished":"2021-02-15","reviewRating":{"ratingValue":4},"description":"As I was walking to my mom&apos;s place, this new restaurant stopped me in my tracks. OMG...I remember that this spot used to be a Chinese bakery. As a kid, I used to frequent the old Golden Carriage to grab a drink after a day at the park with my best friend. When I got to my mom&apos;s I check the place out on Yelp. Asked my son if he would be interested in trying Vegan food. He frowned until I convinced him by showing him the photos.\n\nNYC just opened their indoor dining again on the 14th. Temps were taken before we were allowed to sit down. Right off the bat, let me just say that our waiter was amazing, very attentive and kept asking how we liked everything. Decor-wise...I can&apos;t believe this is the same place, it&apos;s an amazing renovation.\n\nOne thing I read and saw was that their nachos w/chorizo is a must order. When it came to the table, I was like...no way we will finish that. Ummm, it was so frickin delish that we had no trouble. The first thing my son said was...I can&apos;t tell that it&apos;s Vegan and he&apos;s a huge meat eater. Our waiter asked if we wanted hot sauce with our nachos. He came by with 3 different ones and explained the level of hottest. The most mild one reminded me of a mole. I thought that was interesting.\n\nWe all enjoyed the tacos we ordered. The portions were generous and the ingredients fresh. I&apos;m not familiar with Vegan cuisine but now I&apos;m game to venture more into it. \n\nOn a last note: there is major gentrification that has been happening in the Lower East Side of Manhattan for years and I for one do welcome it. Some might argue about it but I think it makes the neighborhood more upscale."}],"aggregateRating":{"#type":"AggregateRating","ratingValue":4.5,"reviewCount":1078},"servesCuisine":"Mexican"}
</script>
How would I access something in this particular script tag and the json info inside of it using BeautifulSoup? Or is there a way to do it via the <a> tag with lxml and .find_all()?
You could use the json module to parse the content of the script tags, which is accessible through the .text attribute.
Here is an example that parses every JSON script block and prints the name:
import json

import requests
from bs4 import BeautifulSoup

url = 'https://www.yelp.com/biz/jajaja-plantas-mexicana-new-york-2?osq=Vegetarian+Food'
r = requests.get(
    url,
    headers={'User-Agent': 'Mozilla/5.0'},
)
parsed_page = BeautifulSoup(r.text, 'html.parser')

# Collect every JSON-LD script block that parses cleanly.
page_jsons = []
for x in parsed_page.select('script[type="application/ld+json"]'):
    try:
        data = json.loads(x.text)
    except json.JSONDecodeError:
        # Skip script tags whose content is not valid JSON.
        continue
    page_jsons.append(data)

# Print the name from each block that has one.
for d in page_jsons:
    if isinstance(d, dict) and d.get('name'):
        print(d['name'])  # => Jajaja Plantas Mexicana
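Once a block has been parsed as above, the individual reviews can be read straight from the resulting dict. A minimal sketch, assuming the dict has the shape shown in the question (review, author, reviewRating.ratingValue, description): summarize_reviews and sample are hypothetical names, and sample is a cut-down copy of the question's data so the snippet runs without a network request. The descriptions contain HTML entities such as &apos;, which html.unescape converts back to plain characters.

```python
import html
import json


def summarize_reviews(data):
    """Return (author, rating, text) tuples from a parsed JSON-LD dict."""
    rows = []
    for review in data.get("review", []):
        rating = review.get("reviewRating", {}).get("ratingValue")
        # Turn entities like &apos; back into plain characters.
        text = html.unescape(review.get("description", ""))
        rows.append((review.get("author"), rating, text))
    return rows


# Shortened stand-in for one entry of page_jsons (descriptions truncated).
sample = json.loads('''
{"name": "Jajaja Plantas Mexicana",
 "review": [
   {"author": "Caroline J.",
    "datePublished": "2021-11-24",
    "reviewRating": {"ratingValue": 5},
    "description": "I typically don&apos;t write reviews for restaurants, but I&apos;ll make an exception!"},
   {"author": "Melissa W.",
    "datePublished": "2021-12-24",
    "reviewRating": {"ratingValue": 5},
    "description": "The food was great."}
 ],
 "aggregateRating": {"ratingValue": 4.5, "reviewCount": 1078}}
''')

for author, rating, text in summarize_reviews(sample):
    print(author, rating, text)
```

In the scraper itself this would be called as summarize_reviews(d) for each dict d in page_jsons that has a 'review' key. The .get() calls with defaults keep it from raising if a review lacks a rating or description.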

Transforming sentences to Numbers using SciKit-Learn’s CountVectorizer()

I am trying to convert the input sentences in the Review column into count vectors using CountVectorizer. I am struggling to handle the sentences that are passed through. How do I process the sentences and turn them into vectors? Any assistance will be highly appreciated.
Input Data:
Sentiment Review
Neg The new Ford Focus came highly recommended to me when I was looking to buy my first new car I researched its history and found that it received great reviews for comfort and safety during its European release Test driving the car I found it to be comfortable well equipped and stylish I have now driven the car for for 6 months and have put only 5000 miles on it While I have been happy with the overall performance of the car I have been sorely disappointed with the workmanship involved Realizing that new models are notorious for having manufacturing bugs I felt somewhat reassured that these would have been worked out from 1998 1999 during the first European release I was wrong My car has been in the repair shop a total of five times for manufacturers defects including a flooded passenger compartment repaired twice to date faulty master clutch cylinder misaligned striker plate on seat back latch broken break switch and cruise control While I really love my car I would hesitate to recommend it to any but my worst enemies Time will tell if the problems my Focus has had are unique or are related to intrinsic design flaws
Neg We bought the Focus ZTS sedan because my wife needed an economical car to haul the grandkids around with We traded in a 94 Explorer with a 5 speed just before the Firestone tire fiasco became public My wife loves driving the car Although it is a bit small for me 6 1 290lbs it is OK The car handles great and with the Zetec engine it has adequate performance although I wouldnt want any less go than its got Now for the problems the main one of which is because I do my own oil changes A particular sore point for me with most cars is that the manufacturers dont make it easy to change the oil and filter without creating a mess This new Focus is particularly bad First the owners manual indicates a Motocraft FL2005 filter The car had an FL801 on it which some ham fisted factory idiot had torqued to about a million foot pounds I had to use some very large pliers and turn the filter almost 3 4 turn before it was loose enough to move by hand Poor quality control The filter happens to be mounted in a horizontal position and is almost flush with the side of the engine When I finally got it loose oil ran down the side of the engine onto the drive axle onto the frame down my arm and all over the driveway Very bad design On other cars I have been able to use a cut off soda bottle placed over the filter to catch the drips On the Focus it wont work The hood on this car is aluminum It bends very easy mine already has a dent in it and I didnt have an accident A minor problem is the power windows They wont operate with the key in the accessory position Tilt wheel also difficult to operate Bottom line only 3 000 miles on this car but its going to get traded off as soon as possible for a vehicle with a little more substance and which is easier to maintain ive owned 9 Fords since 1986 still have 3 If all the newer Fords are made this way the Focus may be the last Ford product I buy
Neg Recently I had the need to rent a car I picked the Ford Focus I was amazed with this car I liked it better than my own more expensive 1999 Toyota Corolla LE The steering wheel is not only height adjustable but also telescopes something you do not normally find on such a reasonably priced car The drivers seat also adjusted forward and back and in height nice feature for someone tall like myself The front seats were roomy and comfortable and the back seat had I think the most leg room I HAVE EVER SEEN in a compact car The stereo sounded good considering it was stock and the face of the radio has an upward tilt to it so that it is driver friendly All the bells and whistles were located within easy reach and the air worked well In addition to having a roomy trunk there are 60 40 split rear seats Child safety seat anchors and shoulder harness seat belts for 5 passengers I rented the 4 door sedan but there are 3 body styles The 4 door sedan 4 door wagon and a sporty little hatch back I have read the safety ratings for the hatch back and from what I recall it got 5 stars This car is definately on my list of cars to consider purchasing in the near future you should take a look at it too
Pos Cruising In My Big T I have had my 91 Thunderbird for 4 years now bought it way back in my freshman year and it has served me well throughout college I am a horrible Northern driver and brutal on my vehicles but this piece of Ford craftsmanship refuses to bail out on me Its a rough and tumble vehicle that remains an incredible deal for the price especially when bought used from a reputable dealer The Advantages 1 Seat Space These are big seats people with the kind of legroom that only those pretentious you know whats in first class usually get their hands on And that spaciousness isnt just about spoiling the people up front either it extends to the back seat as well which means that everyone feels just a little bit more comfortable and relaxed when you get to wherever you re going And not only are the seats big but the generous amount of padding in each makes for an especially comfortable ride 2 Appearance ive gt to admit it to you I just love the look of the Thunderbird though it is an acquired taste to be sure I can best describe the style as Italian sleek in a chunky way and available in colors like burgundy that make it look like a cross between a hit mobile and a hearse 3 Smooth Ride Riding in my Thunderbird has always seemed quite smooth to me especially when you consider how low to the ground it is Why so low That kind of positioning allows the Thunderbird to provide the rider with great control as your feel for the road is significantly enhanced In the same arena as the ride is the ease of use of the console which for me equals smoothness and ease The Thunderbirds radio and air console is incredibly well designed with everything within reach and intuitively organized Seem trivial to you Try changing the station at 75 miles an hour and see how important knob placement is 4 Trunk This is a important feature for me as I seem to move every 3 6 months The trunk on the Thunderbird is big enough for all of your luggage not to mention the corpse of Vinnie The 
Chin from a rival family My Defense I have read another review of this vehicle that criticizes the brake quality and I have to vehemently disagree with it I ride my brakes hard and I have never had a lockup or other incident The brakes do tend to squeek a bit but the noise is no indication of a performance issue The Final Verdict The bottom line is that the Thunderbird is a comfortable and well designed car at a reasonable price As long as you like burgundy vehicles and live in an area thats at least 30 Italian the Thunderbird is a great option
Pos I arrived in the states from Australia at the end of March 1999 to stay there for a year and come home at the end of March 2000 I stayed with friends in South Carolina who is a Ford man as I have always owned GM or Chevs they lent me a 1979 red corvette until I bought myself a car so after 3 months I did buy a 1985 Z28 Camaro 350 to fix up and use for the 9 months after looking at it I thought this was a bad idea so looked in the local paper and found a red 1991 V8 Thunderbird with 114 000 miles on it for 3000 After taking it for a test drive offered the lady 2800 and drove it home it had a slight water leak from the water pump so while replacing it I installed a set of under drive pulleys which I could notice the power increase the first time I drove it put a K amp N air cleaner in as well I had a friend come over from Australia so we drove from Greenville SC across to Sequin TX did about 3000 miles in that trip we took the long way and had no trouble at all and got 27 MPG sitting on 80 MPH had a radar detector it has a highway ratio in it 2 75 My brother came over from Australia so we went from Greenville SC down to Daytona Beach and back then drove across America to California which we did about 4600 miles trouble free When we left Williams AZ the car was buried under snow as we had a cold snap and snow dig the snow away and turn the key starting the engine at once and never missing a beat The car came with the premium sound system but the radio cassette was playing up so replaced it with a Pioneer radio CD I went to the wreckers and bought an electric motor seat assembly for the right hand side so when converting to RHD will have an electric adjustment Also bought a sports instrument cluster and centre handbrake assy out of a super coupe These cars were never made for export or for Right Hand Drive so have to get all the parts needed now for conversion I did 18 000 miles in 9 months without the car stopping or letting me down I gave the car 4 oil changes 
added fuel octane booster with every tank of gas it has the factory 15 alloy wheels with Michelin tyres I found the car very easy to drive and steer but did experience brake shudder which appears to be a common problem due to thin brake rotors I added a rear spoiler and had the windows tinted which makes the car look sporty as in Australia the only 2 door cars are mainly Jap imports so in the end I shipped the car back to Australia where I have to convert it over to Right Hand Drive for our road rules this cars owes me 10 000 Australia 5200 US landed back at my house in Australia which when converted to RHD they sell from 35 000 to 40 000 18 000 to 21 000 US BEST CAR I HAVE EVER OWNED
Pos This review is about Ford Mustang 3 8L Coupe with stick shift I test drove when I considered buying it I say considered because I did not buy it and here is why Test Drive The dealer talked too much during the test drive They always try to do that to distract you but I noticed the following things Styling You can argue but I think it could be better The car looks bulky the C pillars are thick which increases blind spots I was afraid to run over somebody while backing up the standard wheels look crude The previous Mustang looked more balanced Engine The 3 8L 193 hp engine does not seem all that powerful even with stick We went on the freeway onramp and I was disappointed Strange considering the 220 lb ft of torque rating at as low as 2800 rpm European and Japanese manufacturers manage to extract more than 200 hp out of 3 0 liter engines Note the A C was on during the test drive and was very efficient It might eat some power but not that much Transmission The shifter has quite short travel which is good but the clutch does not provide any feedback you cannot feel it engage by the pedal pressure or the dealer talked too much The clutch also engaged very high in the pedal travel I drove some Eastern European cars for several years and never had complaints like this one Or maybe im getting old and grumpy Suspension The suspension is not only stiff but creates a lot of unnecessary up and down motions The car uses live axle in the rear so I didnt expect much anyway Standard Equipment The list of standard equipment looks good It includes power windows mirrors locks and remote keyless entry alloy ugly wheels AM FM CD cassette player A C dual vanity mirrors etc Interior Interior materials fit and finish looks cheap I did not expect walnut for 16K but Ford could have done better As I said the C pillars are wide in coupe and the interior room is smaller than Id like The steering wheel tilts but does not telescope which might be a problem for the tall people Insurance and 
Safety Insurance rates are high especially if you are a male younger than 25 The crash test results are not encouraging either the overall rating is Acceptable with Poor death rate and Marginal injury rate Fuel Economy I didnt get a chance to see the actual fuel consumption myself but on paper its 19 MPG city 29 MPG highway Not impressive for the car of this size with manual transmission Warranty and Reliability Consumer Reports magazine says that Mustang has poor reliability Ford provides 36 000 mile 3 year warranty and 5 year corrosion warranty Majority of other manufacturers offers 60 000 mile 5 year powertrain warranty 100 000 mile 10 year warranty for Hyundai Kia The last three safety fuel economy and reliability also depend on the way you drive Pricing The price was good in theory I know that you can get the car for less than 16K at CarsDirect com for example but the particular dealership I went to wanted more than 17K and did not want to negotiate the price at all Besides they were very pushy and rude Needles to say they did not earn my business they didnt even try The dealer was constantly asking what monthly payment I can afford Well I can afford the payment I need to get better car I walked after which they called me several times asking how they can make me buy the car today I was unable to produce any kind of positive reply on this one I In car buying a lot depends on personal taste If you like Mustangs styling and features and decide to buy it it is a good deal providing you with electric everything remote keyless entry radio CD cassette V6 engine and alloy wheels for less than 16 If you want refinement fit and finish safety and reliability get ready to pay more for something else I
Code attempt:
from sklearn.feature_extraction.text import CountVectorizer
#instantiate the class
cv = CountVectorizer()
#list of sentences
for i in range(len(df['clean_Review'])):
    text=df.loc[i, "clean_Review"]
    #tokenize and build vocab
    cv.fit(text)
    print(cv.vocabulary_)
    #transform the text
    vector = cv.transform(text)
    print(vector.toarray())
    #df.loc[i,"porter"]=test
    i=i+1
You don't need the loop: CountVectorizer expects an iterable of documents, so you can pass it the whole column at once. From the documentation:
from sklearn.feature_extraction.text import CountVectorizer
#instantiate the class
cv = CountVectorizer()
#vector is a sparse matrix storing individual words as "bag of words" model
vector = cv.fit_transform(df["clean_Review"].copy())
I assume that you have already cleaned the corpus (lowercasing, ASCII encoding, stopword removal, etc.) before using CountVectorizer to convert it to a bag of words, so I have left the arguments of CountVectorizer() empty.
Example:
from sklearn.feature_extraction.text import CountVectorizer
import pandas as pd
#sample text corpus
corpus = pd.Series(["aa bb cc dd ee","bb cc dd ee","cc dd ee","dd ee","ee","ee ff"])
#instantiate the class
cv = CountVectorizer()
vector = cv.fit_transform(corpus)
print(corpus)
0 aa bb cc dd ee
1 bb cc dd ee
2 cc dd ee
3 dd ee
4 ee
5 ee ff
dtype: object
print(vector.toarray())
array([[1, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 1, 1]])
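To make it clearer what each column of that matrix means, here is a minimal pure-Python sketch of the same bag-of-words computation. It is not scikit-learn's implementation; it only assumes CountVectorizer's documented defaults (lowercasing, tokens of two or more word characters, columns in sorted vocabulary order). The function name bag_of_words is made up for this illustration.

```python
import re
from collections import Counter


def bag_of_words(corpus):
    """Return (vocabulary, matrix) for a list of documents."""
    # Lowercase each document and keep tokens of two or more word
    # characters, mirroring CountVectorizer's default settings.
    tokenised = [re.findall(r"\b\w\w+\b", doc.lower()) for doc in corpus]
    # One column per distinct token, in sorted order.
    vocabulary = sorted({token for doc in tokenised for token in doc})
    matrix = []
    for doc in tokenised:
        counts = Counter(doc)
        # Counter returns 0 for words that do not occur in this document.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


corpus = ["aa bb cc dd ee", "bb cc dd ee", "cc dd ee", "dd ee", "ee", "ee ff"]
vocabulary, matrix = bag_of_words(corpus)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row is one document and each column is one vocabulary word, so the rows printed here should line up with the rows of vector.toarray() above.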

Extract data points from text

I'd like to extract data from a sentence. I know that this is not a simple and easy process but was wondering what would be the best way about doing so. For example I'd have the sentence:
The silver boat moved through the dark waters, lit by the moonlight.
From that I'd like to get:
Boat: silver, moving through water //moving through water referring to Water:
Water: dark, lit by the moonlight, boat moving //boat moving referring to Boat:
Moonlight: lighting water
Basically I'd like to get all the words that describe each noun, whether a single word or a short phrase. Some details are stored twice (the item Boat has a reference to water, and the item Water has a reference to boat).
I was thinking about getting a list of nouns and finding all of them in the sentence. Then I'd somehow get everything describing each one (all the links).
If someone has already done something similar or if there's a way to get all related data from a sentence, please tell me or give a link.

Extracting (human) names from a description

I'm writing a piece of software that processes TV guide listings and converts them to XMLTV.
I've noticed that a lot of my descriptions say who hosts the show, and I would like to be able to extract that information.
One of the methods I first looked at was regex, however my regex skills aren't great, and there doesn't seem to be a great way to achieve it anyway.
Another option is NLP, however that seems to be a bit over the top for what I need, especially since my descriptions share a common prefix (Hosted by). However, this may be the method I will go with, as it could be the most reliable and easy to use.
For reference, here is an example dataset - some are real, some are made up.
['Hosted by Jim Bolger, John James and Jim Bob, The Project is a show that exists',
'Hosted by Lisa Owen, Newshub Nation is an in-depth weekly current affairs show focusing on the major players and forces that shape New Zealand.',
'A fast paced wrap of all things entertainment, celebrity and Bravo hosted by Cassidy Morris',
'Hosted by chef Guy Fieri, Minute To Win It sees competitors take on a series of seemingly simple tasks while under a one-minute time limit.',
'Hosted by Jim van de Allen, Tom Scott and Petra Grazing, this is fair go',
'Hosted by Zyon Zickle, Johnny Boi and Zippy De Phrasee, The News looks at the important things that affect all Martians',
'Lorem is a magical substence wondered about by generations of things. This series hosted by Jim Tokien, explores this thingie']
I would much rather have quality over quantity: I'd rather it find fewer matches, with most of them accurate, than a lot of inaccurate matches.
Am I overthinking this? Is there a simpler way to do it? Any help would be greatly appreciated.
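Since the descriptions share the "hosted by" prefix, a precision-first regex may be enough. The sketch below is only one possible approach under stated assumptions: a name is two or more capitalised words (optionally joined by lowercase particles such as "van" or "de"), an optional single lowercase word such as "chef" may precede the first name, and additional names are only accepted when the list is closed by "and". The particle list and the extract_hosts helper are made up for this illustration and would need tuning on real listings.

```python
import re

# Assumed lowercase particles that may appear inside a surname.
PARTICLE = r"(?:van|de|der|den|von|la|le|di)"
# A name: a capitalised word followed by one or more capitalised words,
# each optionally preceded by particles ("Jim van de Allen").
NAME = rf"[A-Z][\w'-]*(?: (?:{PARTICLE} )*[A-Z][\w'-]*)+"
# "hosted by", an optional lowercase title ("chef"), then one name or a
# list of the form "A, B and C". Requiring the closing "and" keeps a
# following show title ("Lisa Owen, Newshub Nation is ...") out of the match.
HOSTS = re.compile(
    rf"[Hh]osted by (?:[a-z]+ )?({NAME}(?:(?:, {NAME})* and {NAME})?)"
)


def extract_hosts(description):
    """Return the list of host names found after 'hosted by', or []."""
    match = HOSTS.search(description)
    if match is None:
        return []
    return re.split(r", | and ", match.group(1))


descriptions = [
    "Hosted by Jim Bolger, John James and Jim Bob, The Project is a show that exists",
    "Hosted by Lisa Owen, Newshub Nation is an in-depth weekly current affairs show",
    "Hosted by chef Guy Fieri, Minute To Win It sees competitors take on tasks",
    "Hosted by Jim van de Allen, Tom Scott and Petra Grazing, this is fair go",
    "This series hosted by Jim Tokien, explores this thingie",
]
for description in descriptions:
    print(extract_hosts(description))
```

This favours accuracy over coverage: single-word names and exactly two hosts separated by a comma without "and" are deliberately missed, and a show title that happens to follow the pattern "Name, Title and Title" would still be a false positive. If those cases matter, a named-entity recogniser is the more robust option.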

How to add punctuation to text using python?

I am playing with the IBM Watson Speech to Text service API. For those who do not know it, this service transcribes audio: you upload an audio file and it returns the text. The service has been good so far, but the problem is that the returned text does not contain punctuation marks. I tried solving this problem with nltk, but got no results.
Some nltk code I have tried.
#string is the text
string = """Hey guys today I'm gonna show you how to make bulletproof coffee so free guys and never heard a blooper coffee it's been around since like two thousand two been around for awhile but i think it's been a lot more popular now probably within the last maybe year two years so for me I recently just started doing bulletproof coffee say about a couple so basically regular coffee you know you get your regular coffee and then you put cream and you put sugar mix it up to get this nice little creamy sweet taste so bulletproof coffee the differences as they're putting cream and sugar your you play coffee you put butter grass fed butter and coconut oil so the difference is instead of using sugar you're using and healthy fats and the point of that is it's really good for if you're asked about the cato jenneke dieter basically low carb diet last you won't do is load up your coffee with how much sugar and roads are low carb diet and also a lot of people actually use bulletproof coffee as a way to replace their practice now i'm not saying to do that because albion is as of right now I still you might break I drink my coffee and at all times a wreck this but again though I do eat a lot so home a lot more than a lot of normal people so keep this in mind okay so this is how the rescue workers the book per copy you want to go ahead and put two cups of coffee so it's a here is we have one cup put it into our blender the trick here is the blending park you don't blend it doesn't come out right I use it not blended and then that coconut oil and the butter cut just floats on top when you got this oily coffee taste not the not the best now two cups of coffee you put two teaspoons of butter you wanna go and grass fed butter so let's say this is one teaspoon here yeah I just gonna estimated this will be two teaspoons every go two teaspoons of butter and how it I guess and crazy I remember doing this as a man sir put butter in there but it actually doesn't and it's really really 
good then you put about one to two teaspoons of cooking oil so for me i'm going to go with so here's about probably about one tablespoon and here's to get there you go throw on the cat and so not only is a good to be honest i think on a local wow it works out really well sugar is my car but when it comes to drinking the i can't really put any pre sugar sugar so what i've basically been doing a lot and you know black take the paper personally i can enjoy the cream and sugar so i thought this to be a really good replacement it makes is reading the taste and i think it's leann when i used even though i'm not sugar at all so it tastes good good for you and it's a low carb no take a look got a little foam on the top uh man fiasco we smell this now the butter coconut so check this out you get this nice little cream uh this is a taste test so smooth so creamy and i never put any sugar and a lot of people come in and play like go you know so we can love or something the v. f. or something the person for me albeit like this is perfect you know it's creamy and it's not bitter like black coffee plus it fits nicely with a low carb diet so a lot people drink this morning they replace directors because this does have a lot of calories so here's something to look out for them he noticed two teaspoons of butter want to tease bozo coconut oil a lot of calories just give an example one tablespoon of coconut oil has but a hundred and thirty calories which means two of them it's like two hundred sixty calories the butter is likely to borrow from that so you're looking at sort of ballpark a four hundred calories plus so that's what a lot of people drink this as replacement for their breakfast and with the good fats they give them the energy that they need to go throughout the day and also because that you watched another video talk about the difference we fat and carbs are sugar that exhort slower and so the energy lasts longer it doesn't get any sort of super fast and gets burnt off 
versus when you let's say drink sugary coffee you know we put sugar and cream in your coffee that sugar gets exhort super fast so you get this spike in energy up in the morning and then it starts crashing down but i noticed drinking this you don't gain topic crashes because it's back they said sugar so exhort slowly and kinda sustains so this is how i make my coffee i love it i suggest give it a try guys especially if you're one of coffee drinkers replacing your cream in your sugar with some bulletproof coffee you try some coconut oil and grass fed butter so i tried our guys see i like more workouts mortician how to work out the ads the most is that you want sixpack shortcuts dot com piece"""
import nltk
n = nltk.tokenize.punkt.PunktSentenceTokenizer()
n.sentences_from_text(string)
Is there any way to solve my problem? What would you do if you were in my case?
Unless you've got some serious machine learning skills, you'll have to write your own "rules" to handle this. Rules are a much easier way to get started with this use case.
A sentence must adhere to specific "grammar rules". These rules are pretty basic, which is why they can be taught in primary school. The problem is that most people don't follow grammar rules when writing or speaking, so you'll first need to focus on text that does follow the rules. Based on this premise, here is what I would do:
1. Run the NLTK POS tagger on the string (once we know the parts of speech we can code rules).
2. Identify tokens whose part of speech breaks the grammar rules of a sentence (sentences shouldn't end in a preposition).
3. Identify tokens that a sentence could end with based on grammar rules (nouns are a great start).
4. Find a large, well-punctuated, grammatically correct corpus of text and remove the punctuation.
5. Try your "new rules" out on adding punctuation back to that corpus. Adjust your rules. Rinse and repeat.
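The last two steps above (stripping punctuation from a well-punctuated corpus and scoring rules against it) can be sketched in plain Python. This is only an illustration: the two word lists are hypothetical toy rules standing in for rules you would derive from POS tags, and the function names are made up for the example.

```python
def strip_punctuation(text):
    """Return bare lowercase tokens and the indexes of sentence-final tokens."""
    tokens, gold = [], set()
    for raw in text.split():
        word = raw.strip(".,!?;:")
        if not word:
            continue
        tokens.append(word.lower())
        # Remember where the original text really ended a sentence.
        if raw[-1] in ".!?":
            gold.add(len(tokens) - 1)
    return tokens, gold


# Hypothetical toy rules (assumptions, not taken from any library).
SENTENCE_STARTERS = {"so", "now", "but"}
BAD_ENDINGS = {"in", "on", "of", "to", "with"}  # prepositions


def predict_boundaries(tokens):
    """Guess sentence-final token indexes using the toy rules."""
    predicted = set()
    for i, token in enumerate(tokens):
        is_last = i == len(tokens) - 1
        next_starts = not is_last and tokens[i + 1] in SENTENCE_STARTERS
        if (is_last or next_starts) and token not in BAD_ENDINGS:
            predicted.add(i)
    return predicted


def score(predicted, gold):
    """Return (precision, recall) of the predicted sentence boundaries."""
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


corpus = ("I like coffee. So I drink it every day. Now I put butter in it. "
          "It tastes good. I love it.")
tokens, gold = strip_punctuation(corpus)
predicted = predict_boundaries(tokens)
precision, recall = score(predicted, gold)
print(sorted(gold), sorted(predicted), precision, recall)
```

Tracking precision and recall separately shows whether a rule change adds missing full stops or just adds wrong ones, which is what the "adjust and repeat" loop needs.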
It's 2022 and Hugging Face hosts some easy-to-install machine learning models for adding punctuation back into transcripts:
https://huggingface.co/felflare/bert-restore-punctuation
Example:
from rpunct import RestorePuncts
# The default language is 'english'
rpunct = RestorePuncts(use_cuda=False)
rpunct.punctuate("""in 2018 cornell researchers built a high-powered detector that in combination with an algorithm-driven process called ptychography set a world record
by tripling the resolution of a state-of-the-art electron microscope as successful as it was that approach had a weakness it only worked with ultrathin samples that were
a few atoms thick anything thicker would cause the electrons to scatter in ways that could not be disentangled now a team again led by david muller the samuel b eckert
professor of engineering has bested its own record by a factor of two with an electron microscope pixel array detector empad that incorporates even more sophisticated
3d reconstruction algorithms the resolution is so fine-tuned the only blurring that remains is the thermal jiggling of the atoms themselves""")
This model was trained on 560,000 Yelp sentences and has 90% accuracy.
Install note
As of Sep 2022 the rpunct package has a bug where it won't run without GPUs, but there's a patched fork that supports the use_cuda=False kwarg in my example (slower but works on all processors). To install this fork instead, do this:
pip3 install git+https://github.com/ernie-mlg/rpunct.git
Option 2 (better)
This is another Hugging Face model with similar accuracy, and it installed correctly the first time:
https://huggingface.co/oliverguhr/fullstop-punctuation-multilang-large
pip install deepmultilingualpunctuation
usage:
>>> from deepmultilingualpunctuation import PunctuationModel
>>> model = PunctuationModel()
Downloading config.json: 100%|█████████████████████████████████████████████████████████████████| 892/892 [00:00<00:00, 335kB/s]
Downloading pytorch_model.bin: 100%|██████████████████████████████████████████████████████| 2.08G/2.08G [04:54<00:00, 7.60MB/s]
Downloading tokenizer_config.json: 100%|███████████████████████████████████████████████████████| 406/406 [00:00<00:00, 216kB/s]
Downloading sentencepiece.bpe.model: 100%|████████████████████████████████████████████████| 4.83M/4.83M [00:00<00:00, 8.08MB/s]
Downloading special_tokens_map.json: 100%|█████████████████████████████████████████████████████| 239/239 [00:00<00:00, 158kB/s]
/opt/anaconda3/envs/punct/lib/python3.9/site-packages/transformers/pipelines/token_classification.py:135: UserWarning: `grouped_entities` is deprecated and will be removed in version v5.0.0, defaulted to `aggregation_strategy="none"` instead.
warnings.warn(
>>> text = "My name is Clara and I live in Berkeley California Ist das eine Frage Frau Müller"
>>> result = model.restore_punctuation(text)
>>> print(result)
My name is Clara and I live in Berkeley, California. Ist das eine Frage, Frau Müller?
I verified that this worked out of the box in the fresh conda env shown above. This second model also auto-detects and supports 4 languages.
You could also use the Punctuator2 Python library, either to train your own punctuation-restoration model or to use a pretrained one.
