Among the variety of open-source relational databases, PostgreSQL is probably one of the most popular thanks to its rich feature set, which is why it is widely used wherever databases are involved. In this article, we will walk through connecting to and using PostgreSQL from R. R is an open-source language for statistical computing and graphics that gives scientists, statisticians, and academics powerful tools for data analysis; it also ships with many simulated real-world datasets. R is usually used together with the RStudio IDE, so that is what we will use while connecting to and working with PostgreSQL.

PostgreSQL deployment in R

One of the great things about R is that it has packages for almost every kind of need, and the package ecosystem keeps growing, since packages are created and maintained by the community. Two main packages are available for connecting to PostgreSQL from R: RPostgreSQL and RPostgres. Both provide solid functionality for database interaction; they differ mainly in how they are installed.

The RPostgreSQL package is available on CRAN, the Comprehensive R Archive Network, and is installed by running the following command in the IDE:

install.packages('RPostgreSQL')

The RPostgres package can be installed in two ways: from GitHub or directly from CRAN. To install it from GitHub, first install the devtools and remotes packages:

install.packages('devtools')
install.packages('remotes')

Then install the package itself:

remotes::install_github("r-dbi/RPostgres")

To install the package from CRAN, the usual command is used:

install.packages('RPostgres')

The difference between the two sources is that CRAN hosts the latest stable version of a package, while GitHub carries the latest development version. In practice, RPostgreSQL and RPostgres connect to a PostgreSQL database in the same way: both build on the DBI package, which provides a wide range of methods and classes for establishing database connections. Note: in what follows we use the RPostgres package for establishing the connection.

Establishing a basic connection with the database using R

The RPostgres package connects with the following command:

con <- dbConnect(RPostgres::Postgres())

With the following steps you can set up the connection to a specific database:

library(DBI)
db <- 'DATABASE'  # provide the name of your db
host_db <- 'HOST'  # i.e. 'ec2-54-83-201-96.compute-1.amazonaws.com'
db_port <- '98939'  # or any other port specified by the DBA
db_user <- 'USERNAME'
db_password <- 'PASSWORD'
con <- dbConnect(RPostgres::Postgres(), dbname = db, host = host_db, port = db_port, user = db_user, password = db_password)

To check that the connection is established, we can run the dbListTables(con) function, which returns the list of tables in our database. In our case no tables are stored in the database yet, so now it's time to create one.

Working with the database

As we've already mentioned, R provides a great pack of simulated datasets that can be used directly from the IDE without downloading them first. For our examples, we will use the popular "mtcars" dataset, which contains data from the 1974 Motor Trend magazine car road tests. Let's first add it to the database and then check that it has appeared there.

The basic command to add "mtcars" to our database is dbWriteTable(con, "mtcars", mtcars). But we will apply a little trick to make the table more readable: we set the dataset up as a data frame, move the car names out of the row names into a 'carname' column, and remove the original dataset with the rm(mtcars) command, since the data is now stored in the variable my_data. A sketch of this preparation is shown below.
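In outline, this preparation might look like the following sketch (the variable name my_data comes from the description above; writing to a table named "cars" is an assumption based on the queries used later in this post):

data(mtcars)  # load the built-in dataset into the workspace
my_data <- data.frame(carname = rownames(mtcars), mtcars, row.names = NULL)  # car names become a regular column
rm(mtcars)  # remove the original copy; the data now lives in my_data
dbWriteTable(con, "cars", my_data)  # write the data frame to a PostgreSQL table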
Then, let's check how our table looks. Having a table in the database, we can now explore queries. For working with queries, two basic methods are needed: dbGetQuery and dbSendQuery. The dbGetQuery method runs a query and returns all of its results in a data frame. The dbSendQuery method only registers the query; the data then has to be retrieved with dbFetch, whose parameters also let you fetch the results in batches.

A database table should have a primary key, that is, a unique identifier for every record in the table. Let's make the names of the cars in our table the primary key using the dbGetQuery method:

dbGetQuery(con, 'ALTER TABLE cars ADD CONSTRAINT cars_pk PRIMARY KEY ("carname")')

We have already used the dbReadTable method, but let's return to it briefly to clarify how it works. dbReadTable returns an overview of the data stored in the database and does essentially the same job as dbGetQuery(con, 'SELECT * FROM cars'). Note that after a dbSendQuery request, the dbClearResult method must be called to release the pending result set. dbGetQuery does this automatically, so there is no need to call dbClearResult after it.

Creating basic queries

Queries against a table are written the same way as in plain SQL; the only difference is that in R the results of a query are stored in a variable. First, we send the query selecting the needed data from our cars table and store the result object in a new variable. Then we fetch it into a result variable, from which we can create a new table in our database and analyze the output of our query (a complete sketch of this workflow is assembled at the end of this post). Finally, the connection must be closed with the dbDisconnect(con) method.

Conclusion

In this article, we tried to cover the basics of connecting to and using PostgreSQL from the R environment. Knowing the essentials of SQL syntax and of querying and modifying data in R is enough to connect to any standard database. Nevertheless, we suggest reading through the package documentation, which will give you more insights into querying data from PostgreSQL into the R environment.
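For reference, here is the workflow from the Creating basic queries section assembled into a single sketch. The SELECT statement and the names res, result, and efficient_cars are illustrative assumptions, not code from the original walkthrough:

# Register a query; nothing is fetched yet
res <- dbSendQuery(con, 'SELECT * FROM cars WHERE "mpg" > 20')
# Fetch the results into a data frame (pass n = 10 to fetch in batches)
result <- dbFetch(res)
# Release the pending result set (required after dbSendQuery)
dbClearResult(res)
# Create a new table in the database from the query output
dbWriteTable(con, "efficient_cars", result)
# Close the connection
dbDisconnect(con)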
Introduction

Exploratory data analysis (EDA) is an approach to analyzing data that summarizes its main characteristics. It can be performed with various methods, among which data visualization takes a prominent place. The idea of EDA is to discover what information the data can give us beyond formal modeling or hypothesis testing. In other words, when we have few or no a priori ideas about the patterns and relationships within the data, exploratory analysis helps us identify its main tendencies, properties, and nature. Based on the information obtained, the researcher can evaluate the structure and nature of the available data, which eases the search for questions and goals for further exploration. EDA is thus a crucial step before feature engineering and can include part of the data preprocessing.

In this tutorial, we will show you how to perform simple EDA using the Google Play Store Apps dataset. To begin with, let's install (if needed) and load all the necessary libraries:

# Remove warnings
options(warn=-1)
# Load libraries
require(ggplot2)
require(highcharter)
require(dplyr)
require(tidyverse)
require(corrplot)
require(RColorBrewer)
require(xts)
require(treemap)
require(lubridate)

Data overview

Insights from Play Store apps can tell developers a lot about the Android market. Each row of the dataset holds values for the category, rating, size, and other characteristics of an app. Here are the columns of our dataset:

App - name of the application.
Category - category of the app.
Rating - application's rating on the Play Store.
Reviews - number of the app's reviews.
Size - size of the app.
Installs - number of installs of the app.
Type - whether the app is free or paid.
Price - price of the app (0 if free).
Content Rating - target audience of the app.
Genres - genre the app belongs to.
Last Updated - date the app was last updated.
Current Ver - current version of the application.
Android Ver - minimum Android version required to run the app.

Now, let's load the data and view the first rows. For that, we use the head() function:

df<-read.csv("googleplaystore.csv",na.strings = c("NaN","NA",""))
head(df)

## App Category
## 1 Photo Editor & Candy Camera & Grid & ScrapBook ART_AND_DESIGN
## 2 Coloring book moana ART_AND_DESIGN
## 3 U Launcher Lite – FREE Live Cool Themes, Hide Apps ART_AND_DESIGN
## 4 Sketch - Draw & Paint ART_AND_DESIGN
## 5 Pixel Draw - Number Art Coloring Book ART_AND_DESIGN
## 6 Paper flowers instructions ART_AND_DESIGN
## Rating Reviews Size Installs Type Price Content.Rating
## 1 4.1 159 19M 10,000+ Free 0 Everyone
## 2 3.9 967 14M 500,000+ Free 0 Everyone
## 3 4.7 87510 8.7M 5,000,000+ Free 0 Everyone
## 4 4.5 215644 25M 50,000,000+ Free 0 Teen
## 5 4.3 967 2.8M 100,000+ Free 0 Everyone
## 6 4.4 167 5.6M 50,000+ Free 0 Everyone
## Genres Last.Updated Current.Ver
## 1 Art & Design January 7, 2018 1.0.0
## 2 Art & Design;Pretend Play January 15, 2018 2.0.0
## 3 Art & Design August 1, 2018 1.2.4
## 4 Art & Design June 8, 2018 Varies with device
## 5 Art & Design;Creativity June 20, 2018 1.1
## 6 Art & Design March 26, 2017 1.0
## Android.Ver
## 1 4.0.3 and up
## 2 4.0.3 and up
## 3 4.0.3 and up
## 4 4.2 and up
## 5 4.4 and up
## 6 2.3 and up

It's useful to see the data format before performing the analysis. We can also review the columns' data types using the str function:

str(df)
## 'data.frame': 10841 obs. of 13 variables:
## $ App : Factor w/ 9660 levels "- Free Comics - Comic Apps",..: 7229 2563 8998 8113 7294 7125 8171 5589 4948 5826 ...
## $ Category : Factor w/ 34 levels "1.9","ART_AND_DESIGN",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Rating : num 4.1 3.9 4.7 4.5 4.3 4.4 3.8 4.1 4.4 4.7 ...
## $ Reviews : Factor w/ 6002 levels "0","1","10","100",..: 1183 5924 5681 1947 5924 1310 1464 3385 816 485 ...
## $ Size : Factor w/ 462 levels "1,000+","1.0M",..: 55 30 368 102 64 222 55 118 146 120 ...
## $ Installs : Factor w/ 22 levels "0","0+","1,000,000,000+",..: 8 20 13 16 11 17 17 4 4 8 ...
## $ Type : Factor w/ 3 levels "0","Free","Paid": 2 2 2 2 2 2 2 2 2 2 ...
## $ Price : Factor w/ 93 levels "$0.99","$1.00",..: 92 92 92 92 92 92 92 92 92 92 ...
## $ Content.Rating: Factor w/ 6 levels "Adults only 18+",..: 2 2 2 5 2 2 2 2 2 2 ...
## $ Genres : Factor w/ 120 levels "Action","Action;Action & Adventure",..: 10 13 10 10 12 10 10 10 10 12 ...
## $ Last.Updated : Factor w/ 1378 levels "1.0.19","April 1, 2016",..: 562 482 117 825 757 901 76 726 1317 670 ...
## $ Current.Ver : Factor w/ 2832 levels "0.0.0.2","0.0.1",..: 120 1019 465 2825 278 114 278 2392 1456 1430 ...
## $ Android.Ver : Factor w/ 33 levels "1.0 and up","1.5 and up",..: 16 16 16 19 21 9 16 19 11 16 ...

As you can see, we get information similar to head(), but here the focus is on the data types rather than the contents. Next, we use the summary function, which produces summaries of the data:

summary(df)

## App
## ROBLOX : 9
## CBS Sports App - Scores, News, Stats & Watch Live: 8
## 8 Ball Pool : 7
## Candy Crush Saga : 7
## Duolingo: Learn Languages Free : 7
## ESPN : 7
## (Other) :10796
## Category Rating Reviews
## FAMILY :1972 Min. : 1.000 0 : 596
## GAME :1144 1st Qu.: 4.000 1 : 272
## TOOLS : 843 Median : 4.300 2 : 214
## MEDICAL : 463 Mean : 4.193 3 : 175
## BUSINESS : 460 3rd Qu.: 4.500 4 : 137
## PRODUCTIVITY: 424 Max. :19.000 5 : 108
## (Other) :5535 NA's :1474 (Other):9339
## Size Installs Type Price
## Varies with device:1695 1,000,000+ :1579 0 : 1 0 :10040
## 11M : 198 10,000,000+:1252 Free:10039 $0.99 : 148
## 12M : 196 100,000+ :1169 Paid: 800 $2.99 : 129
## 14M : 194 10,000+ :1054 NA's: 1 $1.99 : 73
## 13M : 191 1,000+ : 907 $4.99 : 72
## 15M : 184 5,000,000+ : 752 $3.99 : 63
## (Other) :8183 (Other) :4128 (Other): 316
## Content.Rating Genres Last.Updated
## Adults only 18+: 3 Tools : 842 August 3, 2018: 326
## Everyone :8714 Entertainment: 623 August 2, 2018: 304
## Everyone 10+ : 414 Education : 549 July 31, 2018 : 294
## Mature 17+ : 499 Medical : 463 August 1, 2018: 285
## Teen :1208 Business : 460 July 30, 2018 : 211
## Unrated : 2 Productivity : 424 July 25, 2018 : 164
## NA's : 1 (Other) :7480 (Other) :9257
## Current.Ver Android.Ver
## Varies with device:1459 4.1 and up :2451
## 1.0 : 809 4.0.3 and up :1501
## 1.1 : 264 4.0 and up :1375
## 1.2 : 178 Varies with device:1362
## 2.0 : 151 4.4 and up : 980
## (Other) :7972 (Other) :3169
## NA's : 8 NA's : 3

NA analysis

After getting acquainted with the dataset, we should check it for NA values and duplicates. Detecting and removing such records helps to build models with better accuracy. First, let's analyze the missing values. We can review the result as a table:

sapply(df,function(x)sum(is.na(x)))

## App Category Rating Reviews Size
## 0 0 1474 0 0
## Installs Type Price Content.Rating Genres
## 0 1 0 1 0
## Last.Updated Current.Ver Android.Ver
## 0 8 3

Or as a chart:

[Chart: "Columns with NA values" - bar chart of the NA counts for Rating, Current.Ver, and Android.Ver]
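The original post shows this chart without the code behind it; a minimal ggplot2 sketch that could reproduce it looks like this (the key and value names mirror the chart's axis labels and are our assumption):

# Count NAs per column and keep only the columns that have any
na_counts <- sapply(df, function(x) sum(is.na(x)))
na_df <- data.frame(key = names(na_counts), value = na_counts) %>%
  filter(value > 0)
# Horizontal bar chart of NA counts per column
ggplot(na_df, aes(x = reorder(key, value), y = value)) +
  geom_col() +
  coord_flip() +
  labs(title = "Columns with NA values", x = "key", y = "value")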
As you can see, three columns contain missing values, and the Rating column has by far the most. Let's remove those records:

df = na.omit(df)

Duplicate records removal

The next step is to check whether there are duplicates. We can compare the total number of rows with the number of unique rows:

distinct <- nrow(df %>% distinct())
nrow(df) - distinct

## [1] 474

Having detected the duplicates, we remove them:

df=df[!duplicated(df), ]

With the data precleaned, we can begin the visual analysis.

Analysis using visualization tools

To start off, we will review the Category column and examine which categories are the most and the least popular:

df %>%
  count(Category, Installs) %>%
  group_by(Category) %>%
  # Installs is still a factor here (e.g. "10,000+"), so convert the labels,
  # not the level codes, to numbers before summing
  summarize(TotalInstalls = sum(n * as.numeric(gsub("[+,]", "", as.character(Installs))))) %>%
  arrange(-TotalInstalls) %>%
  hchart('scatter', hcaes(x = "Category", y = "TotalInstalls", size = "TotalInstalls", color = "Category")) %>%
  hc_add_theme(hc_theme_538()) %>%
  hc_title(text = "Most popular categories (# of installs)")

[Chart: "Most popular categories (# of installs)" - scatter of TotalInstalls by Category]

Here we can see that Game is the most popular category by installs. Interestingly, Education has almost the lowest popularity, and Comics is also near the bottom.

Now, we want to see the percentage of apps in each category. The pie chart is not the most widespread type of visual, but when you need to show percentages, it is one of the best options. Let's count the apps in each category and expand our color palette:

freq<-table(df$Category)
fr<-as.data.frame(freq)
fr <- fr %>% arrange(desc(Freq))
coul = brewer.pal(12, "Paired")
# We can add more tones to this palette:
coul = colorRampPalette(coul)(15)
op <- par(cex = 0.5)
pielabels <- sprintf("%s = %3.1f%s", fr$Var1, 100*fr$Freq/sum(fr$Freq), "%")
pie(fr$Freq, labels=NA, clockwise=TRUE, col=coul, border="black", radius=0.5, cex=1)
legend("right",legend=pielabels,bty="n", fill=coul)

We can see that Family now becomes the leader among the categories; Education also has a higher share here than Comics.

Now, let's look closer at the prices of the apps and review how many free apps are available on the Play Store:

tmp <- df %>%
  count(Type) %>%
  mutate(perc = round((n /sum(n))*100)) %>%
  arrange(desc(perc))
hciconarray(tmp$Type, tmp$perc, size = 5) %>%
  hc_title(text="Percentage of paid vs. free apps")

[Chart: "Percentage of paid vs. free apps" - icon array of Free vs Paid]

As you can see, 93% of the apps are free.
Let's see the median price in each category:

df %>%
  filter(Type == "Paid") %>%
  group_by(Category) %>%
  # Price is still a factor here, so convert the labels (e.g. "$0.99") to numbers
  summarize(Price = median(as.numeric(gsub("\\$", "", as.character(Price))))) %>%
  arrange(-Price) %>%
  hchart('treemap', hcaes(x = 'Category', value = 'Price', color = 'Price')) %>%
  hc_add_theme(hc_theme_elementary()) %>%
  hc_title(text="Median price per category") %>%
  hc_legend(align = "left", verticalAlign = "top", layout = "vertical", x = 0, y = 100)

[Chart: "Median price per category" - treemap of categories sized and colored by median price]

This chart is a treemap. In general, it is used to display a data hierarchy and to summarize data through two values at once (size and color). Here we can see that the Parenting category has the highest median price, while Personalization and Social have the lowest.

Now we will build a correlation heatmap, first performing some data preprocessing:

df <- df %>%
  mutate(
    Installs = gsub("\\+", "", as.character(Installs)),
    Installs = as.numeric(gsub(",", "", Installs)),
    Size = gsub("M", "", Size),
    Size = ifelse(grepl("k", Size), 0, as.numeric(Size)),
    Rating = as.numeric(Rating),
    Reviews = as.numeric(as.character(Reviews)),  # convert the factor labels, not the level codes
    Price = as.numeric(gsub("\\$", "", as.character(Price)))
  ) %>%
  filter(Type %in% c("Free", "Paid"))

extract = c("Rating","Reviews","Size","Installs","Price")
df.extract = df[extract]
df.extract %>%
  filter(is.nan(df.extract$Reviews)) %>%
  filter(is.na(df.extract$Size))

## [1] Rating Reviews Size Installs Price
## <0 rows> (or 0-length row.names)

df.extract = na.omit(df.extract)
cor_matrix = cor(df.extract)
corrplot(cor_matrix,method = "color",order = "AOE",addCoef.col = "grey")

Unfortunately, there is no strong correlation between the columns.

Also, let's see the number of installs by content rating:

tmp <- df %>%
  group_by(Content.Rating) %>%
  summarize(Total.Installs = sum(Installs)) %>%
  arrange(-Total.Installs)
highchart() %>%
  hc_chart(type = "funnel") %>%
  hc_add_series_labels_values(labels = tmp$Content.Rating, values = tmp$Total.Installs) %>%
  hc_title(text="Number of Installs by Content Rating") %>%
  hc_add_theme(hc_theme_elementary())

[Chart: "Number of Installs by Content Rating" - funnel from Everyone down through Teen, Everyone 10+, Mature 17+, Adults only 18+, and Unrated]

As you might have guessed, apps rated for everyone and for teens account for the vast majority of installs on the Play Store. You may notice the hc_add_theme line in the code: it adds a theme to your chart. Highcharter has an extensive list of themes you can choose from.

One of the most popular chart types is the time series, which we will explore last. We will also transform our date column from text to dates using the lubridate package.
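As a quick illustration of what this conversion does (a made-up one-liner, not part of the original walkthrough), lubridate's mdy() parses the "month day, year" strings found in Last.Updated into Date objects:

mdy("January 7, 2018")
## [1] "2018-01-07"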
# Get number of apps by last updated date
tmp <- df %>% count(Last.Updated)
# Transform date column type from text to date
tmp$Last.Updated<-mdy(tmp$Last.Updated)
# Transform data into time series
time_series <- xts(tmp$n, order.by = tmp$Last.Updated)
highchart(type = "stock") %>%
  hc_title(text = "Last updated date") %>%
  hc_subtitle(text = "Number of applications by date of last update") %>%
  hc_add_series(time_series) %>%
  hc_add_theme(hc_theme_economist())

[Chart: "Last updated date" - interactive time series of the number of applications by date of last update, from May 21, 2010 to Aug 8, 2018]

Such a visualization is very convenient, as it offers zoom options, a range slider, date filtering, and point hovering. Using this chart, we can see that the number of updates increases over time.

Conclusion

To sum up, exploratory data analysis is a powerful tool for getting a comprehensive view of a dataset. In general, we can divide EDA into the following stages: data overview, NA analysis, duplicate-record removal, and data exploration. Starting with a review of the data structure, columns, and contents, we move on to estimating and preparing the data for further analysis. Finally, visual data exploration helps to find dependencies, distributions, and more.
In the modern world, the flow of information that befalls a person is daunting, and it has rather abruptly changed the basic principles of data perception. Visualization is therefore becoming the main tool for presenting information: it puts information before the audience in a more accessible, clear, and visual form. A properly chosen visualization method makes it possible to structure large data arrays, schematically depict elements that are insignificant in content, and make information more comprehensible.

Python is one of the most popular languages for data processing and analysis, largely due to the speed at which its libraries are created and developed, granting practically unlimited possibilities for various kinds of data processing. The same is true for data visualization libraries. In this article, we will look at the basic data visualization tools used in the Python environment.

Matplotlib

Matplotlib is perhaps the most widely known Python library for data visualization. Being easy to use, it offers ample opportunity to fine-tune the way data is displayed.

[Figure: polar area chart]

The library provides the main visualization algorithms, including scatter plots, line plots, histograms, bar plots, box plots, and more. It is worth noting that the library has fairly extensive documentation, which makes it comfortable to work with even for beginners in data processing and visualization.

[Figure: multicategorical plot]

One of the main advantages of this library is its well-thought-out hierarchical structure. The highest level is the functional interface called matplotlib.pyplot, which allows users to create complex infographics with just a couple of lines of code by choosing ready-made solutions from the functions it offers.

[Figure: histogram]

The convenience of creating visualizations with Matplotlib comes not only from the number of built-in graphic commands but also from the rich arsenal of settings for the standard forms. These include the ability to set arbitrary colors, shapes, line or marker types, line thickness, transparency level, font size and type, and so on.

Seaborn

Despite the wide popularity of the Matplotlib library, it has one drawback that can become critical for some users: a low-level API. As a result, creating truly complex infographics may require writing a lot of generic code.

[Figure: hexbin plot]

Fortunately, this problem is successfully addressed by the Seaborn library, a kind of high-level wrapper over Matplotlib. With its help, users can create colorful, specialized visualizations: heat maps, time series, violin charts, and much more.

[Figure: Seaborn heatmap]

Being highly customizable, Seaborn gives users wide opportunities to add unique and fancy looks to their charts quite simply and with little time cost.

ggplot

Users who have experience with R have probably heard of ggplot2, a powerful data visualization tool within the R programming environment, recognized as one of the best tools for the graphical presentation of information. Fortunately, the extensive capabilities of this library are now available in Python thanks to a port of the package, available there under the name ggplot.

[Figure: box plot]

As we mentioned earlier, the process of data visualization has a deep internal structure.
In other words, creating a visualization is a clearly structured process, and that structure largely shapes the way you think while building infographics. ggplot teaches the user to work within this structured approach, so that in the process of consistently composing commands, the user automatically starts detecting patterns in the data.

[Figure: scatter plot]

Moreover, the library is very flexible: ggplot provides ample opportunity for customizing how data is displayed and for preprocessing datasets before they are rendered.

Bokeh

Despite the rich potential of the ggplot library, some users may miss interactivity. For those who need interactive data visualization, there is the Bokeh library.

[Figure: stacked area chart]

Bokeh is an open-source Python library whose output is rendered client-side in JavaScript, allowing users to create flexible, powerful, and beautiful visualizations for web applications. With its help, users can build anything from simple bar charts to complex, highly detailed interactive visualizations without writing a single line of JavaScript. Have a look at the Bokeh gallery to get an idea of its interactive features.

plotly

For those who need interactive diagrams, we also recommend checking out the plotly library. It is positioned primarily as an online platform on which users can create and publish their own visualizations. However, the library can also be used offline, without uploading the visualization to the plotly server.

[Figure: contour plot]

Because the developers position this library mostly as a standalone product, it is constantly being refined and expanded, providing users with truly extensive possibilities for data visualization, whether interactive graphics or contour plots. You can find examples of plotly's features at https://plot.ly/python/.

Conclusion

Over the past few years, the data visualization tools available to Python developers have made a significant leap forward. Many powerful packages have appeared and keep expanding, implementing quite complex ways of graphically representing information. This allows users not only to create various infographics but also to make them truly attractive and understandable to the audience.