Graph python similar to R - python
I have a table like this one: (Ignore the columns "Index" and "D")
+-------+-----------------------+----------+----------+----------+
| Index | Type | Male | Female | D |
+-------+-----------------------+----------+----------+----------+
| 44 | Life struggles | 2.097324 | 3.681356 | 1.584032 |
| 2 | Writing notes | 2.677262 | 3.354730 | 0.677468 |
| 18 | Empathy | 3.528117 | 4.083051 | 0.554933 |
| 12 | Criminal damage | 2.926650 | 2.374150 | 0.552501 |
| 20 | Giving | 2.650367 | 3.196944 | 0.546577 |
| 21 | Compassion to animals | 3.666667 | 4.178268 | 0.511602 |
| 33 | Mood swings | 2.965937 | 3.451613 | 0.485676 |
| 10 | Funniness | 3.574572 | 3.104907 | 0.469665 |
| 38 | Children | 3.354523 | 3.805415 | 0.450891 |
| 47 | Small - big dogs | 3.221951 | 2.801695 | 0.420256 |
+-------+-----------------------+----------+----------+----------+
and I am trying to do a similar graph :
I know how to do it in R but not in python
I tried this:
sns.stripplot(data=df,y="Male",color="Blue")
sns.stripplot(data=df,y="Female",color="red")
But I don't know how to continue. Does someone have an idea?
This is easily done with matplotlib: it is simply a scatter plot with the categories as y-values.
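import matplotlib.pyplot as plt

plt.style.use('ggplot')
fig, ax = plt.subplots()
# One marker per row: x is the average response, y is the category label
ax.plot(df['Male'], df['Type'], 'o', color='xkcd:reddish', ms=10, label='Male')
ax.plot(df['Female'], df['Type'], 'o', color='xkcd:teal', ms=10, label='Female')
# Vertical reference line at x = 3, the middle of the plotted 1-5 range
ax.axvline(3, ls='-', color='k')
ax.set_xlim(1, 5)
ax.set_xlabel('avg response')
ax.set_ylabel('Variable')
# Place the legend above the axes, centered
ax.legend(bbox_to_anchor=(0.5, 1.02), loc='lower center',
          ncol=2, title='group')
fig.tight_layout()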
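If you would rather stay with seaborn, as in the attempt from the question, one possible route is to reshape the table to long form first so that a single stripplot call can color the points by group. This is only a minimal sketch: it assumes df has the Type, Male and Female columns shown in the question, reproduces just the first three rows, and the names long_df, plot_groups, 'group' and 'avg response' are made up for illustration.

```python
import pandas as pd

# First three rows of the table from the question
df = pd.DataFrame({
    "Type": ["Life struggles", "Writing notes", "Empathy"],
    "Male": [2.097324, 2.677262, 3.528117],
    "Female": [3.681356, 3.354730, 4.083051],
})

# Wide -> long: one row per (Type, group) pair
long_df = df.melt(id_vars="Type", value_vars=["Male", "Female"],
                  var_name="group", value_name="avg response")


def plot_groups(data):
    """Draw one marker per (Type, group) pair, colored by group."""
    import seaborn as sns  # imported here so the reshaping runs without seaborn
    return sns.stripplot(data=data, x="avg response", y="Type",
                         hue="group", jitter=False, size=10)
```

Calling plot_groups(long_df) should then put both groups on the same categorical axis, which is roughly what the two separate stripplot calls were aiming for.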
Related
Plot multiple bar plots for multiple columns
I have a dataset that looks roughly like the table below. I need to create a barplot for each column TS1 to TS5 that counts the number of each item in that column. The items are one of the following: NOT_SEEN, NOT_ABLE, HIGH_BAR, and numerical values between 110 and 140 in steps of 2 (so 110, 112, 114, etc.).
I have found a way to do this which works fine, but is there a way to use a loop or something similar so I don't have to copy and paste the same code 5 times (once for each of the 5 columns)?
This is what I have tried, and it works:
num_range = list(range(110,140, 2))
OUTCOMES = ['NOT_SEEN', 'NOT_ABLE', 'HIGH_BAR']
OUTCOMES.extend([str(num) for num in num_range])
OUTCOMES = CategoricalDtype(OUTCOMES, ordered = True)
fig, ax =plt.subplots(2, 3, sharey=True)
fig.tight_layout(pad=3)
The block below is what I copy 5 times, changing only the title (Testing 1, Testing 2, etc.) and the column name TS1, TS2, ... in the first line.
df["outcomes"] = df["TS1"].astype(OUTCOMES)
bpt=sns.countplot(x= "outcomes", data=df, palette='GnBu', ax=ax[0,0])
plt.setp(bpt.get_xticklabels(), rotation=60, size=6, ha='right')
bpt.set(xlabel='')
bpt.set_title('Testing 1')
The following code comes after the 5 copies of the above.
ax[1,2].set_visible(False)
plt.show()
I am sure there is a much better way to do this, but I'm new to all this. Also, I need to make sure the bars of the barplot are ordered from left to right as NOT_SEEN, NOT_ABLE, HIGH_BAR, and then 110, 112, 114, etc.
I am using Python 2.7 (not my choice, unfortunately) and pandas 0.24.2.
+----+------+------+----------+----------+----------+----------+----------+
| ID | VIEW | YEAR | TS1      | TS2      | TS3      | TS4      | TS5      |
+----+------+------+----------+----------+----------+----------+----------+
| AA | NO   | 2005 |          | 134      |          | HIGH_BAR |          |
| AB | YES  | 2015 |          |          | NOT_SEEN |          |          |
| AB | YES  | 2010 | 118      |          |          |          | NOT_ABLE |
| BB | NO   | 2020 |          |          |          |          |          |
| BA | YES  | 2020 |          |          |          | NOT_SEEN |          |
| AA | NO   | 2010 |          |          |          |          |          |
| BA | NO   | 2015 |          |          |          |          | 132      |
| BB | YES  | 2010 |          | HIGH_BAR |          | 140      | NOT_ABLE |
| AA | YES  | 2020 |          |          |          |          |          |
| AB | NO   | 2010 |          |          |          | 112      |          |
| AB | YES  | 2015 |          |          | NOT_ABLE |          | HIGH_BAR |
| BB | NO   | 2020 |          |          |          | 145      |          |
| BA | NO   | 2015 |          | 110      |          |          |          |
| AA | YES  | 2010 | HIGH_BAR |          |          | NOT_SEEN |          |
| BA | YES  | 2015 |          |          |          |          |          |
| AA | NO   | 2020 |          |          |          | 118      |          |
| BA | YES  | 2015 |          | 180      | NOT_ABLE |          |          |
| BB | YES  | 2020 |          | NOT_SEEN |          |          | 126      |
+----+------+------+----------+----------+----------+----------+----------+
You can put the plotting lines in a function and call it in a for loop, automatically changing the column, title and axis in each iteration:
fig, axes = plt.subplots(2, 3, sharey=True)
fig.tight_layout(pad=3)

def plotting(column, title, ax):
    df["outcomes"] = df[column].astype(OUTCOMES)
    bpt = sns.countplot(x="outcomes", data=df, palette='GnBu', ax=ax)
    plt.setp(bpt.get_xticklabels(), rotation=60, size=6, ha='right')
    bpt.set(xlabel='')
    bpt.set_title(title)

columns = ['TS1', 'TS2', 'TS3', 'TS4', 'TS5']
titles = ['Testing 1', 'Testing 2', 'Testing 3', 'Testing 4', 'Testing 5']

for column, title, ax in zip(columns, titles, axes.flatten()):
    plotting(column, title, ax)

axes[1,2].set_visible(False)
plt.show()
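A side note on the category setup from the question: range(110, 140, 2) stops at 138, so '140' is not among the categories, and any value outside the category list (such as the 145 and 180 in the sample table) becomes NaN when cast and is not counted. The sketch below illustrates this on a hand-made Series. The names labels, outcomes, ts and ts_cat are only for illustration, and extending the range to range(110, 142, 2) so that 140 is included is an assumption about what was intended.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Ordered categories: the three labels first, then 110..140 in steps of 2.
# range(110, 142, 2) is used so that 140 is included (an assumption).
labels = ["NOT_SEEN", "NOT_ABLE", "HIGH_BAR"]
labels += [str(n) for n in range(110, 142, 2)]
outcomes = CategoricalDtype(labels, ordered=True)

# Hand-made column of strings; "145" is not one of the categories
ts = pd.Series(["134", "HIGH_BAR", "140", "145", "NOT_SEEN", "134"])
ts_cat = ts.astype(outcomes)

# Values outside the categories become NaN and drop out of the counts,
# while unused categories are still listed with a count of 0
counts = ts_cat.value_counts(sort=False)
```

Because the dtype is ordered, the category order (labels first, then the numbers) is what determines the left-to-right bar order that the question asks for.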
Plotting for next row after the slice
I am plotting the values of columns X and FT according to the value of column CN with the following code:
import matplotlib.pyplot as plt
plt.plot(X[CN==1],FT[CN==1])
plt.plot(X[CN==36],FT[CN==36])
and the data is given as
+-------+-----+----+-------+-------+
| X     | N   | CN | Vdiff | FT    |
+-------+-----+----+-------+-------+
| 524   | 2   | 1  | 0.0   | 0.12  |
| 534   | 2   | 1  | 0.0   | 0.134 |
| 525   | 2   | 1  | 0.0   | 0.154 |
| .     |     |    |       | .     |
| .     |     |    |       | .     |
| 5976  | 15  | 14 | 0.0   | 3.54  |
| 5913  | 15  | 14 | 0.1   | 3.98  |
| 5923  | 0   | 15 | 0.0   | 3.87  |
| .     |     |    |       | .     |
| .     |     |    |       | .     |
| 33001 | 7   | 36 | 0.0   | 7.36  |
| 33029 | 7   | 36 | 0.0   | 8.99  |
| 33023 | 7   | 36 | 0.1   | 12.45 |
| 33114 | 0   | 37 | 0.0   | 14.33 |
+-------+-----+----+-------+-------+
I am getting incomplete graphs, so I need to include the next row in each plot. For example, for the graph of CN==36, plt.plot(X[CN==36],FT[CN==36]), I also want to use the first row of CN==37. Note that CN values are repetitive. I have to plot multiple graphs in this way, so general code for the graphs above would be appreciated.
Addition on request in a comment: look at the ends of the circular shapes. Their edges do not touch, so the circle is incomplete, for example in the aqua and green cycles. I want complete cycles, so I need 1 or 2 additional rows of data in each plot.
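One possible way to pick up the extra row is to widen the boolean mask by one position before slicing. The following is only a sketch under assumptions: the data sits in a DataFrame named df whose rows are already in plotting order, only a few rows of the table are reproduced, and the helper name rows_with_next is made up for illustration.

```python
import pandas as pd

# A few rows taken from the table in the question
df = pd.DataFrame({
    "X":  [5913, 5923, 33001, 33029, 33023, 33114],
    "CN": [14, 15, 36, 36, 36, 37],
    "FT": [3.98, 3.87, 7.36, 8.99, 12.45, 14.33],
})


def rows_with_next(data, cn):
    """Rows where CN == cn, plus the row right after each run of them."""
    mask = data["CN"] == cn
    # Shifting the mask down by one also marks the row after each match
    extended = mask | mask.shift(1, fill_value=False)
    return data[extended]


closed = rows_with_next(df, 36)
# plt.plot(closed["X"], closed["FT"]) would then draw the extended segment
```

If two extra rows are wanted, the same idea extends by also OR-ing in mask.shift(2, fill_value=False). Since the mask is positional, it should also cope with a CN value that occurs in several separate runs.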
Plotting a CDF from a multiclass pandas dataframe
I understand the package empiricaldist provides a CDF function as per the documentation. However, I find it tricky to plot my dataframe when a column has multiple values.
df.head()
+------+---------+---------------+-------------+----------+----------+-------+--------------+-----------+-----------+-----------+-----------+------------+
|      | trip_id | seconds_start | seconds_end | duration | distance | speed | acceleration | lat_start | lon_start | lat_end   | lon_end   | travelmode |
+------+---------+---------------+-------------+----------+----------+-------+--------------+-----------+-----------+-----------+-----------+------------+
| 0    | 318410  | 1461743310    | 1461745298  | 1988     | 5121.49  | 2.58  | 0.00130      | 41.162687 | -8.615425 | 41.177888 | -8.597549 | car        |
| 1    | 318411  | 1461749359    | 1461750290  | 931      | 1520.71  | 1.63  | 0.00175      | 41.177949 | -8.597074 | 41.177839 | -8.597574 | bus        |
| 2    | 318421  | 1461806871    | 1461806941  | 70       | 508.15   | 7.26  | 0.10370      | 37.091240 | -8.211239 | 37.092322 | -8.206681 | foot       |
| 3    | 318422  | 1461837354    | 1461838024  | 670      | 1207.39  | 1.80  | 0.00269      | 37.092082 | -8.205060 | 37.091659 | -8.206462 | car        |
| 4    | 318425  | 1461852790    | 1461853845  | 1055     | 1470.49  | 1.39  | 0.00132      | 37.091628 | -8.202143 | 37.092095 | -8.205070 | foot       |
+------+---------+---------------+-------------+----------+----------+-------+--------------+-----------+-----------+-----------+-----------+------------+
I would like to plot a CDF for each travel mode in the column travelmode.
groups = df.groupby('travelmode')
However, I don't really understand from the documentation how this could be done.
You can plot them in a loop, for example:
import matplotlib.pyplot as plt
from empiricaldist import Cdf

def decorate_plot(title):
    ''' Adds labels to plot '''
    plt.xlabel('Outcome')
    plt.ylabel('CDF')
    plt.title(title)

for tm in df['travelmode'].unique():
    # Keep only the rows that belong to the current travel mode
    subset = df[df['travelmode'] == tm]
    for col in df.columns:
        if col != 'travelmode':
            # Create new figures for each plot
            fig, ax = plt.subplots()
            d4 = Cdf.from_seq(subset[col])
            d4.plot()
            decorate_plot(f"{tm} - {col}")
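If empiricaldist is not available, a per-group empirical CDF can also be computed by hand. The following is a minimal sketch using only pandas and numpy, not the method from the answer above. The helper ecdf and the dictionary cdfs are hypothetical names, and only the travelmode and duration columns of the five rows shown in the question are used.

```python
import numpy as np
import pandas as pd

# travelmode and duration from the five rows of df.head() in the question
df = pd.DataFrame({
    "travelmode": ["car", "bus", "foot", "car", "foot"],
    "duration": [1988, 931, 70, 670, 1055],
})


def ecdf(values):
    """Return the sorted values and the cumulative fractions 1/n, 2/n, ..., 1."""
    x = np.sort(np.asarray(values))
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p


# One (x, p) pair per travel mode
cdfs = {mode: ecdf(group["duration"])
        for mode, group in df.groupby("travelmode")}
```

Each pair could then be drawn on shared axes with something like plt.step(x, p, where='post', label=mode), which keeps all travel modes in one figure for comparison.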
Multi-Index Lookup Mapping
I'm trying to create a new column which has a value based on 2 indices of that row. I have 2 dataframes with equivalent multi-index on the levels I'm querying (but not of equal size). For each row in the 1st dataframe, I want the value of the 2nd df that matches the row's indices.
I originally thought perhaps I could use a .loc[] and filter off the index values, but I cannot seem to get this to change the output row-by-row. If I wasn't using a dataframe object, I'd loop over the whole thing to do it. I have tried to use the .apply() method, but I can't figure out what function to pass to it.
Creating some toy data with the same structure:
import pandas as pd
import numpy as np

# NOTE: this assignment rebinds np.random.seed instead of calling it,
# so the generator is not actually seeded; np.random.seed(1) would seed it
np.random.seed = 1
df = pd.DataFrame({'Aircraft':np.ones(15),
                   'DC':np.append(np.repeat(['A','B'], 7), 'C'),
                   'Test':np.array([10,10,10,10,10,10,20,10,10,10,10,10,10,20,10]),
                   'Record':np.array([1,2,3,4,5,6,1,1,2,3,4,5,6,1,1]),
                   # There are multiple "value" columns in my data, but I have simplified here
                   'Value':np.random.random(15)
                   }
                  )
df.set_index(['Aircraft', 'DC', 'Test', 'Record'], inplace=True)
df.sort_index(inplace=True)
v = pd.DataFrame({'Aircraft':np.ones(7),
                  'DC':np.repeat('v',7),
                  'Test':np.array([10,10,10,10,10,10,20]),
                  'Record':np.array([1,2,3,4,5,6,1]),
                  'Value':np.random.random(7)
                  }
                 )
v.set_index(['Aircraft', 'DC', 'Test', 'Record'], inplace=True)
v.sort_index(inplace=True)
df['v'] = df.apply(lambda x: v.loc[df.iloc[x]])
This returns an error for indexing on the multi-index.
To set all values to a single "v" value:
df['v'] = float(v.loc[(slice(None), 'v', 10, 1), 'Value'])
So the inputs look like this:
--------------------------------------------
| Aircraft | DC | Test | Record | Value    |
|----------|----|------|--------|----------|
| 1.0      | A  | 10   | 1      | 0.847576 |
|          |    |      | 2      | 0.860720 |
|          |    |      | 3      | 0.017704 |
|          |    |      | 4      | 0.082040 |
|          |    |      | 5      | 0.583630 |
|          |    |      | 6      | 0.506363 |
|          |    | 20   | 1      | 0.844716 |
|          | B  | 10   | 1      | 0.698131 |
|          |    |      | 2      | 0.112444 |
|          |    |      | 3      | 0.718316 |
|          |    |      | 4      | 0.797613 |
|          |    |      | 5      | 0.129207 |
|          |    |      | 6      | 0.861329 |
|          |    | 20   | 1      | 0.535628 |
|          | C  | 10   | 1      | 0.121704 |
--------------------------------------------
--------------------------------------------
| Aircraft | DC | Test | Record | Value    |
|----------|----|------|--------|----------|
| 1.0      | v  | 10   | 1      | 0.961791 |
|          |    |      | 2      | 0.046681 |
|          |    |      | 3      | 0.913453 |
|          |    |      | 4      | 0.495924 |
|          |    |      | 5      | 0.149950 |
|          |    |      | 6      | 0.708635 |
|          |    | 20   | 1      | 0.874841 |
--------------------------------------------
And after the operation, I want this:
| Aircraft | DC | Test | Record | Value    | v        |
|----------|----|------|--------|----------|----------|
| 1.0      | A  | 10   | 1      | 0.847576 | 0.961791 |
|          |    |      | 2      | 0.860720 | 0.046681 |
|          |    |      | 3      | 0.017704 | 0.913453 |
|          |    |      | 4      | 0.082040 | 0.495924 |
|          |    |      | 5      | 0.583630 | 0.149950 |
|          |    |      | 6      | 0.506363 | 0.708635 |
|          |    | 20   | 1      | 0.844716 | 0.874841 |
|          | B  | 10   | 1      | 0.698131 | 0.961791 |
|          |    |      | 2      | 0.112444 | 0.046681 |
|          |    |      | 3      | 0.718316 | 0.913453 |
|          |    |      | 4      | 0.797613 | 0.495924 |
|          |    |      | 5      | 0.129207 | 0.149950 |
|          |    |      | 6      | 0.861329 | 0.708635 |
|          |    | 20   | 1      | 0.535628 | 0.874841 |
|          | C  | 10   | 1      | 0.121704 | 0.961791 |
Edit: as you are on pandas 0.23.4, just replace droplevel with reset_index and the option drop=True:
df_result = (df.reset_index('DC').assign(v=v.reset_index('DC', drop=True))
               .set_index('DC', append=True)
               .reorder_levels(v.index.names))
Original: One way is to move the index level DC of df to the columns, use assign to create the new column on it, and then restore the index with set_index and reorder_levels:
df_result = (df.reset_index('DC').assign(v=v.droplevel('DC'))
               .set_index('DC', append=True)
               .reorder_levels(v.index.names))
Out[1588]:
                            Value         v
Aircraft DC Test Record
1.0      A  10   1       0.847576  0.961791
                 2       0.860720  0.046681
                 3       0.017704  0.913453
                 4       0.082040  0.495924
                 5       0.583630  0.149950
                 6       0.506363  0.708635
            20   1       0.844716  0.874841
         B  10   1       0.698131  0.961791
                 2       0.112444  0.046681
                 3       0.718316  0.913453
                 4       0.797613  0.495924
                 5       0.129207  0.149950
                 6       0.861329  0.708635
            20   1       0.535628  0.874841
         C  10   1       0.121704  0.961791
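Another way to express the same lookup, shown here only as a sketch and not as part of the answer above, is to drop the DC level on both sides and reindex v onto the remaining levels of df. It assumes that the non-DC levels of v are unique, and it uses a cut-down version of the toy data with values taken from the tables in the question. The names lookup and keys are made up, and Series.droplevel and to_numpy need pandas 0.24 or later.

```python
import pandas as pd

names = ["Aircraft", "DC", "Test", "Record"]

# Four rows of df and two rows of v, with values from the question's tables
df = pd.DataFrame({
    "Aircraft": [1.0, 1.0, 1.0, 1.0],
    "DC": ["A", "A", "B", "C"],
    "Test": [10, 20, 10, 10],
    "Record": [1, 1, 1, 1],
    "Value": [0.847576, 0.844716, 0.698131, 0.121704],
}).set_index(names).sort_index()

v = pd.DataFrame({
    "Aircraft": [1.0, 1.0],
    "DC": ["v", "v"],
    "Test": [10, 20],
    "Record": [1, 1],
    "Value": [0.961791, 0.874841],
}).set_index(names).sort_index()

# Lookup table keyed by (Aircraft, Test, Record) only
lookup = v["Value"].droplevel("DC")

# Keys of df without DC; repeated keys are fine on this side
keys = df.index.droplevel("DC")

# reindex aligns by key; to_numpy() drops the 3-level index before assignment
df["v"] = lookup.reindex(keys).to_numpy()
```

The index of df is left untouched here, so no set_index or reorder_levels step is needed afterwards.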
How do I make this bs4 webscraping code travel down a table and store strings into a 2d array?
I am building a web scraper that tracks a changing list using BS. The HTML tags for the objects I am looking for are generic except for their ids, which are unique and constantly changing. I know the top id will always be the same, so I have gotten to the point where my output gives me the top result in the format I need, but I am trying to figure out a way of adding the next nine <tr>. I cannot use their ids because they change, so I thought of using .find_next('tr'), but I can't figure out how to get it past the second <tr>. I know that there must be an elegant solution, but it is my first time using BS4, so I was hoping that someone could point me in the right direction.
import requests
from bs4 import BeautifulSoup
import numpy as np

website_url = requests.get ('').text
soup = BeautifulSoup(website_url, 'lxml')
L = []
H = ["H1","H2","H3"]
for derp in soup.find(id='tr-id-1').findAll('a')[0:3:1]:
    L.append(derp.string)
A = np.vstack((H, L))
print(A)
This gets me the printed array in the right format, but only for the <tr> with the id I entered in the find. I can get the second row by writing
for derp in soup.find(id='tr-id-1').find_next('tr').findAll('a')[0:3:1]:
but I don't know how to get further. I am only trying to scrape the first 10 rows of the table, so I am thinking that I might need a while loop with a countdown marker? I am wondering if there is a way to create a loop that selectively takes the next 9 rows and appends the specific column data to the array.
This script prints the table with currencies (you can store the data in a list or a NumPy array instead):
import requests
from bs4 import BeautifulSoup

url = 'https://coinmarketcap.com'
soup = BeautifulSoup(requests.get(url).text, 'lxml')

for tr in soup.select('#currencies tr'):
    if not tr.select('td'):
        continue
    for i, td in enumerate(tr.select('td')[:-2]):
        txt = td.text.replace('\n', ' ').replace('*', '').strip()
        if i == 0:
            print('{: ^4}'.format(txt), end='|')
        else:
            print('{: ^24}'.format(txt), end='|')
    print()
Prints:
1 | BTC Bitcoin | $196,174,869,053 | $11020.77 | $24,501,665,241 | 17,800,475 BTC | -5.84% |
2 | ETH Ethereum | $30,603,567,177 | $286.61 | $8,821,119,760 | 106,776,759 ETH | -2.40% |
3 | XRP XRP | $16,148,857,177 | $0.379379 | $1,335,082,415 | 42,566,596,173 XRP | -4.06% |
4 | LTC Litecoin | $7,401,989,981 | $118.37 | $4,167,212,036 | 62,533,191 LTC | -4.15% |
5 | BCH Bitcoin Cash | $7,133,878,965 | $399.09 | $1,770,785,779 | 17,875,463 BCH | -3.87% |
6 | EOS EOS | $5,292,523,634 | $5.74 | $2,181,469,151 | 921,990,507 EOS | -2.58% |
7 | BNB Binance Coin | $4,621,088,383 | $32.73 | $267,344,456 | 141,175,490 BNB | -1.87% |
8 | USDT Tether | $3,684,665,566 | $0.999098 | $23,550,580,244 | 3,687,991,972 USDT | -0.53% |
9 | BSV Bitcoin SV | $3,534,271,930 | $197.94 | $351,461,514 | 17,854,986 BSV | -1.55% |
10 | TRX TRON | $2,113,478,617 | $0.031695 | $803,645,870 | 66,682,072,191 TRX | -0.96% |
11 | ADA Cardano | $1,981,827,482 | $0.076439 | $119,258,825 | 25,927,070,538 ADA | -3.69% |
12 | XLM Stellar | $1,940,474,350 | $0.099896 | $358,001,782 | 19,425,036,996 XLM | -2.60% |
13 | LEO UNUS SED LEO | $1,749,404,864 | $1.75 | $12,215,975 | 999,498,893 LEO | -2.01% |
14 | XMR Monero | $1,552,808,370 | $90.96 | $127,576,800 | 17,070,711 XMR | -0.03% |
15 | DASH Dash | $1,360,432,697 | $152.74 | $254,426,418 | 8,906,619 DASH | -4.60% |
16 | LINK Chainlink | $1,248,540,238 | $3.57 | $188,091,151 | 350,000,000 LINK | 4.24% |
17 | NEO NEO | $1,191,827,236 | $16.90 | $490,262,022 | 70,538,831 NEO | -4.08% |
18 | MIOTA IOTA | $1,069,692,929 | $0.384847 | $20,590,490 | 2,779,530,283 MIOTA | -2.74% |
19 | ATOM Cosmos | $1,021,900,211 | $5.36 | $63,724,815 | 190,688,439 ATOM | -4.31% |
20 | ETC Ethereum Classic | $872,993,215 | $7.81 | $751,025,201 | 111,727,165 ETC | -2.11% |
21 | XTZ Tezos | $817,988,097 | $1.24 | $7,008,121 | 658,849,612 XTZ | -1.62% |
22 | XEM NEM | $807,925,560 | $0.089770 | $24,960,771 | 8,999,999,999 XEM | -2.04% |
23 | ZEC Zcash | $700,497,262 | $101.45 | $281,578,113 | 6,905,119 ZEC | -1.92% |
24 | ONT Ontology | $675,289,519 | $1.36 | $133,352,633 | 494,757,215 ONT | -4.79% |
25 | MKR Maker | $655,751,917 | $655.75 | $1,156,367 | 1,000,000 MKR | 1.59% |
26 | CRO Crypto.com Chain | $552,054,533 | $0.071538 | $3,056,529 | 7,716,894,977 CRO | -2.14% |
27 | BTG Bitcoin Gold | $460,804,983 | $26.31 | $11,765,404 | 17,513,924 BTG | -5.22% |
28 | QTUM Qtum | $456,630,457 | $4.76 | $303,751,103 | 95,845,424 QTUM | -6.03% |
29 | DOGE Dogecoin | $452,796,907 | $0.003766 | $160,357,833 | 120,219,215,287 DOGE | 13.34% |
30 | VET VeChain | $416,649,897 | $0.007513 | $57,432,988 | 55,454,734,800 VET | 0.22% |
31 | BAT Basic Attenti... | $372,192,333 | $0.292373 | $28,777,633 | 1,273,006,300 BAT | -1.68% |
32 | USDC USD Coin | $366,029,067 | $0.997092 | $110,990,052 | 367,096,485 USDC | -0.35% |
33 | OMG OmiseGO | $323,389,435 | $2.31 | $102,874,355 | 140,245,398 OMG | -4.46% |
34 | VSYS V Systems | $312,745,092 | $0.178751 | $10,413,916 | 1,749,608,504 VSYS | -2.14% |
35 | DCR Decred | $297,378,169 | $29.63 | $1,739,049 | 10,037,096 DCR | -6.28% |
36 | BTT BitTorrent | $277,023,930 | $0.001306 | $56,422,080 | 212,116,500,000 BTT | -0.66% |
37 | HOT Holo | $229,759,018 | $0.001725 | $25,162,937 | 133,214,575,156 HOT | 0.21% |
38 | EGT Egretia | $222,938,874 | $0.052953 | $39,938,247 | 4,210,121,792 EGT | 2.24% |
39 | TUSD TrueUSD | $213,775,752 | $0.989291 | $131,504,347 | 216,089,898 TUSD | -1.22% |
40 | HC HyperCash | $208,038,166 | $4.78 | $13,740,143 | 43,529,781 HC | -5.15% |
41 | BCD Bitcoin Diamond | $202,441,610 | $1.09 | $3,011,645 | 186,492,898 BCD | -2.68% |
42 | RVN Ravencoin | $199,913,461 | $0.051124 | $15,906,528 | 3,910,345,000 RVN | -4.85% |
43 | HEDG HedgeTrade | $199,512,069 | $0.691805 | $1,395,797 | 288,393,355 HEDG | -4.07% |
44 | HT Huobi Token | $199,033,233 | $3.98 | $96,051,667 | 50,000,200 HT | -1.05% |
45 | AOA Aurora | $196,149,743 | $0.029982 | $10,250,381 | 6,542,330,148 AOA | 3.98% |
46 | LSK Lisk | $195,904,100 | $1.66 | $8,908,589 | 118,280,370 LSK | -4.19% |
47 | NPXS Pundi X | $193,713,239 | $0.000815 | $4,963,859 | 237,816,087,583 NPXS | -3.55% |
48 | KMD Komodo | $188,203,691 | $1.64 | $12,446,638 | 114,883,815 KMD | 5.35% |
49 | BTM Bytom | $188,040,836 | $0.187572 | $54,667,305 | 1,002,499,275 BTM | 11.50% |
50 | WAVES Waves | $177,993,586 | $1.78 | $12,125,048 | 100,000,000 WAVES | -5.44% |
51 | ZRX 0x | $171,782,082 | $0.287372 | $13,858,310 | 597,769,457 ZRX | -2.12% |
52 | QBIT Qubitica | $169,037,263 | $60.18 | $56,989 | 2,808,628 QBIT | -2.17% |
53 | BTS BitShares | $163,390,853 | $0.059810 | $3,167,151 | 2,731,850,000 BTS | 0.32% |
54 | PAX Paxos Standar... | $162,646,375 | $0.997482 | $136,732,457 | 163,056,875 PAX | -0.43% |
55 | NANO Nano | $161,436,979 | $1.21 | $15,637,856 | 133,248,297 NANO | -1.85% |
56 | BCN Bytecoin | $159,764,211 | $0.000868 | $116,637 | 184,066,828,814 BCN | -6.31% |
57 | REP Augur | $156,392,893 | $14.22 | $5,068,790 | 11,000,000 REP | -3.05% |
58 | NRG Energi | $156,304,538 | $8.70 | $1,129,572 | 17,972,740 NRG | -1.29% |
59 | MONA MonaCoin | $152,581,722 | $2.32 | $6,087,802 | 65,729,675 MONA | -3.20% |
60 | THR ThoreCoin | $148,663,548 | $1714.97 | $179,021 | 86,686 THR | -5.74% |
61 | IOST IOST | $146,115,831 | $0.012162 | $31,901,773 | 12,013,965,609 IOST | -2.13% |
62 | ICX ICON | $142,531,341 | $0.301076 | $10,890,859 | 473,406,688 ICX | -4.32% |
63 | DGB DigiByte | $138,427,137 | $0.011541 | $1,186,651 | 11,994,056,188 DGB | -5.09% |
64 | ZIL Zilliqa | $136,925,993 | $0.015762 | $13,494,196 | 8,687,360,058 ZIL | -2.40% |
65 | KCS KuCoin Shares | $129,815,988 | $1.45 | $26,079,874 | 89,659,415 KCS | -2.98% |
66 | LAMB Lambda | $125,711,992 | $0.251424 | $36,282,311 | 500,000,000 LAMB | -1.68% |
67 | XIN Mixin | $125,658,873 | $277.73 | $887,298 | 452,447 XIN | -5.58% |
68 | ABBC ABBC Coin | $125,028,718 | $0.247542 | $83,030,205 | 505,080,602 ABBC | -5.76% |
69 | SC Siacoin | $121,947,079 | $0.002949 | $2,043,076 | 41,353,612,700 SC | -4.06% |
70 | GXC GXChain | $121,693,968 | $2.03 | $3,042,648 | 60,000,000 GXC | -4.99% |
71 | AE Aeternity | $121,162,277 | $0.444154 | $38,500,380 | 272,793,174 AE | -4.23% |
72 | XVG Verge | $116,468,172 | $0.007369 | $1,604,278 | 15,805,409,499 XVG | -3.23% |
73 | ETP Metaverse ETP | $116,154,120 | $1.62 | $40,461,666 | 71,759,885 ETP | -9.33% |
74 | STEEM Steem | $110,124,952 | $0.340978 | $1,122,121 | 322,967,892 STEEM | -2.49% |
75 | ARDR Ardor | $109,691,634 | $0.109801 | $1,327,459 | 998,999,495 ARDR | -5.31% |
76 | ELF aelf | $108,074,954 | $0.217880 | $15,735,213 | 496,030,000 ELF | -1.35% |
77 | INB Insight Chain | $107,508,460 | $0.307252 | $5,435,619 | 349,902,689 INB | -6.46% |
78 | SOLVE SOLVE | $107,021,001 | $0.327169 | $9,286,887 | 327,112,052 SOLVE | 12.73% |
79 | VEST VestChain | $105,387,369 | $0.014889 | $396,465 | 7,078,400,000 VEST | -11.00% |
80 | QNT Quant | $101,347,141 | $10.37 | $13,719,402 | 9,777,236 QNT | 14.61% |
81 | NEX Nash Exchange | $99,192,791 | $2.74 | $1,985,748 | 36,196,678 NEX | -3.50% |
82 | THETA THETA | $98,543,466 | $0.113203 | $2,157,896 | 870,502,690 THETA | -5.94% |
83 | DENT Dent | $96,923,187 | $0.001332 | $6,458,186 | 72,745,838,994 DENT | -5.83% |
84 | WTC Waltonchain | $96,854,881 | $2.32 | $22,574,334 | 41,682,339 WTC | 16.17% |
85 | MCO Crypto.com | $93,731,944 | $5.93 | $8,346,732 | 15,793,831 MCO | -2.05% |
86 | MAID MaidSafeCoin | $93,684,812 | $0.207014 | $685,687 | 452,552,412 MAID | -6.19% |
87 | SNT Status | $91,353,836 | $0.026323 | $17,797,371 | 3,470,483,788 SNT | -3.86% |
88 | ENJ Enjin Coin | $89,470,435 | $0.115345 | $4,932,359 | 775,679,781 ENJ | -2.92% |
89 | EKT EDUCare | $89,461,163 | $0.124975 | $2,309,456 | 715,835,137 EKT | -2.38% |
90 | DAI Dai | $88,618,335 | $0.982690 | $18,747,792 | 90,179,367 DAI | -0.80% |
91 | GNT Golem | $88,468,793 | $0.091730 | $1,127,506 | 964,450,000 GNT | -3.47% |
92 | XZC Zcoin | $86,935,432 | $11.06 | $2,154,223 | 7,861,468 XZC | -4.14% |
93 | NAS Nebulas | $83,446,607 | $1.72 | $10,626,245 | 48,627,715 NAS | 4.07% |
94 | STRAT Stratis | $81,564,126 | $0.820611 | $3,546,613 | 99,394,330 STRAT | -7.04% |
95 | NET NEXT | $78,533,755 | $1.56 | $12,188,679 | 50,269,268 NET | 22.71% |
96 | REN Ren | $77,607,280 | $0.100819 | $8,913,932 | 769,764,831 REN | -8.98% |
97 | CCCX Clipper Coin | $74,987,003 | $0.019861 | $56,719 | 3,775,570,996 CCCX | 13.61% |
98 | MXM Maximine Coin | $70,357,334 | $0.042667 | $2,647,414 | 1,649,000,000 MXM | -2.66% |
99 | WAX WAX | $69,037,228 | $0.073224 | $548,672 | 942,821,662 WAX | -5.35% |
100 | SAN Santiment Net... | $69,026,823 | $1.10 | $21,347 | 62,660,371 SAN | -2.61% |
Using attribute and class selectors, you can easily scrape the table:
import requests
from bs4 import BeautifulSoup
from pprint import pprint

def make_soup(url: str) -> BeautifulSoup:
    # Send a browser-like User-Agent and fail early on HTTP errors
    res = requests.get(url, headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'
    })
    res.raise_for_status()
    return BeautifulSoup(res.text, 'html.parser')

def scrape_coins(soup: BeautifulSoup) -> dict:
    table = soup.select_one('#currencies')
    coins = {}
    for row in table.select('tbody > tr'):
        symbol = row.select_one('.currency-symbol').text.strip()
        name = row.select_one('.currency-name-container').text.strip()
        cap = row.select_one('.market-cap')['data-usd']
        price = row.select_one('.price')['data-usd']
        volume = row.select_one('.volume')['data-usd']
        supply = row.select_one('[data-supply]')['data-supply']
        change = row.select_one('[data-percentusd]')['data-percentusd']
        # One entry per coin, keyed by its ticker symbol
        coins[symbol] = {
            'name': name,
            'cap': float(cap),
            'price': float(price),
            'volume': float(volume),
            'supply': float(supply),
            'change': float(change),
        }
    return coins

if __name__ == "__main__":
    url = 'https://coinmarketcap.com/'
    soup = make_soup(url)
    info = scrape_coins(soup)
    pprint(info)
output:
{'BTC': {'cap': 196969226244.0,
         'change': -5.4235,
         'name': 'Bitcoin',
         'price': 11065.3915833,
         'supply': 17800475.0,
         'volume': 24574484943.9},
 'ETH': {'cap': 30724660168.6,
         'change': -2.00031,
         'name': 'Ethereum',
         'price': 287.746701554,
         'supply': 106776758.874,
         'volume': 8840470261.58},
 'LTC': {'cap': 7439287857.04,
         'change': -3.64038,
         'name': 'Litecoin',
         'price': 118.965428838,
         'supply': 62533190.774,
         'volume': 4181083872.28},
 'XRP': {'cap': 16149651071.9,
         'change': -4.05122,
         'name': 'XRP',
         'price': 0.379397286226,
         'supply': 42566596173.0,
         'volume': 1332204345.98}}
... and so on
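Coming back to the original question of taking the row with the known id plus the next nine rows: one option is find_next_siblings with a limit, which avoids chaining find_next calls. The following is a minimal sketch run against a small inline HTML string instead of a live page. The id tr-id-1 and the header names H1 to H3 follow the question, but the table markup and the cell contents are invented for illustration.

```python
import numpy as np
from bs4 import BeautifulSoup

# Made-up table with 12 rows; only the first row has a stable id
rows_html = "".join(
    '<tr id="{}"><td><a>a{}</a></td><td><a>b{}</a></td>'
    '<td><a>c{}</a></td><td><a>d{}</a></td></tr>'.format(
        "tr-id-1" if i == 1 else "x{}".format(i), i, i, i, i)
    for i in range(1, 13)
)
soup = BeautifulSoup("<table>" + rows_html + "</table>", "html.parser")

first = soup.find(id="tr-id-1")
# The known row followed by its next nine sibling <tr> elements
rows = [first] + first.find_next_siblings("tr", limit=9)

header = ["H1", "H2", "H3"]
# Text of the first three links of every row, one list per row
body = [[a.get_text() for a in tr.find_all("a")[:3]] for tr in rows]

# 2D array: one header row followed by ten data rows
A = np.vstack([header] + body)
```

On a real page the rows may be wrapped in a tbody or separated by other elements, so the selector for the first row and the sibling search may need adjusting.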