AWS Athena: All You Need To Know


What is AWS Athena?

AWS Athena is a serverless query service that lets you analyze data stored in Amazon S3 buckets using SQL. You define a schema (tables and columns) that describes the data, then run SQL queries against it.

One of the best things about AWS Athena is that, because it is serverless, you only pay for what you use: pricing depends on the amount of data your queries scan.

Sometimes we have to analyze large data sets to extract statistics. For finance reports, for example, these could be profits earned, top-selling products, fastest-selling products, and so on.

Finding all this information manually would be nearly impossible. In situations like these, we can use AWS Athena to run SQL queries on the data and compute the statistics we want.
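As a rough illustration of what running such a query looks like from code, here is a minimal Python sketch. It assumes a boto3 Athena client is passed in (for example `boto3.client("athena")`); the `sales` table, its columns, and the helper name `run_query` are hypothetical and only serve the example.

```python
import time

# Hypothetical table and columns, used only for illustration.
TOP_PRODUCTS_SQL = """
SELECT product_id, SUM(quantity) AS units_sold
FROM sales
GROUP BY product_id
ORDER BY units_sold DESC
LIMIT 10
"""


def run_query(client, sql, database, output_location, poll_seconds=1.0):
    """Run one Athena query and return the first page of rows as lists of strings.

    `client` is expected to behave like a boto3 Athena client.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes result files to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Queries run asynchronously, so poll until a terminal state is reached.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {query_id} ended in state {state}")

    # Only the first page of results is read here; for SELECT queries
    # the first row holds the column names.
    results = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]
```

A call might look like `run_query(boto3.client("athena"), TOP_PRODUCTS_SQL, "my_database", "s3://my-bucket/athena-results/")`, where the database name and bucket are placeholders. Larger result sets would need pagination, which this sketch leaves out.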

AWS Athena Features

This service offers a lot of great features. Let’s look at some of them.

Serverless

The best feature of AWS Athena is that it is serverless: there are no servers to set up or maintain, and no charge for idle infrastructure.

Everything is managed by AWS, and you only pay for the amount of data your queries scan.

SQL Support

This is the main feature of AWS Athena. It supports standard SQL, so anyone with SQL experience can start using the service right away: put the data in an Amazon S3 bucket and query it.

User Friendly

The AWS Athena console is user-friendly: you can write queries directly in the editor and see all the tables and schemas you created for the dataset.

You can also save queries for later use and export query results.

Step Functions Support

This is a newer feature that lets you build workflows that analyze different data sets in sequence by using Step Functions state machines.

Step Functions lets you run multiple queries in parallel or in sequence, and provides a way to use the results of a previous query when running the next one.
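To make the idea of chaining steps concrete, here is a minimal sketch that builds a two-state workflow definition in Amazon States Language from Python. The state names, the placeholder query, and the `build_state_machine` helper are assumptions for illustration. The output path `$.QueryExecution.QueryExecutionId` reflects how the synchronous Athena integration is generally documented to return its result, so verify it against the Step Functions documentation before relying on it.

```python
import json


def build_state_machine(query, database, output_location):
    """Return an Amazon States Language definition (as JSON text) that
    runs one Athena query and then fetches its results."""
    definition = {
        "Comment": "Run an Athena query, then fetch its results",
        "StartAt": "RunQuery",
        "States": {
            "RunQuery": {
                "Type": "Task",
                # The .sync suffix makes the state wait for the query to finish.
                "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
                "Parameters": {
                    "QueryString": query,
                    "QueryExecutionContext": {"Database": database},
                    "ResultConfiguration": {"OutputLocation": output_location},
                },
                "Next": "FetchResults",
            },
            "FetchResults": {
                "Type": "Task",
                "Resource": "arn:aws:states:::athena:getQueryResults",
                "Parameters": {
                    # Assumed output shape of the previous state (see lead-in).
                    "QueryExecutionId.$": "$.QueryExecution.QueryExecutionId",
                },
                "End": True,
            },
        },
    }
    return json.dumps(definition, indent=2)
```

The second state reads the query ID produced by the first, which is the "use the previous query's data" pattern described above. Further states could consume the fetched rows to build the next query.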

To know more about AWS Step Functions, read here.

Machine Learning

There is also an option to use machine learning from within SQL queries in AWS Athena, which lets you perform tasks such as making predictions from the data.

Scalable Performance

Because AWS Athena is serverless, you don’t need to worry about scaling servers for heavy, CPU-intensive queries. AWS scales performance by running queries in parallel and distributing the query load, so you can query data sets of widely varying sizes.

To read about more features in detail, check out the official documentation by Amazon.

AWS Athena Pricing

Pricing for this service is pretty straightforward. Let’s look at how it works.

AWS Athena Pricing Chart

At the time of writing, you are charged $5 per TB of data scanned. Whenever you run a query, you are charged for the amount of data it scans.

AWS rounds the amount of scanned data up to the nearest megabyte, with a 10 MB minimum per query. Cancelled queries are also charged for any data scanned before the cancellation.
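The arithmetic behind a single query's cost can be sketched in a few lines of Python. This is an estimate under stated assumptions, not an official calculator: it uses the $5 per TB rate quoted above, rounds up to the next megabyte, applies the 10 MB per-query minimum listed on Athena's pricing page, treats 1 TB as 1024^4 bytes, and assumes a query that scans no data costs nothing. The function name `estimate_query_cost` is made up for this example.

```python
import math

PRICE_PER_TB_USD = 5.0       # rate quoted in this article; check current pricing
BYTES_PER_MB = 1024 ** 2     # assumption: binary units
BYTES_PER_TB = 1024 ** 4
MIN_BILLED_MB = 10           # per-query minimum


def estimate_query_cost(bytes_scanned):
    """Estimate the cost in USD of one query that scanned `bytes_scanned` bytes."""
    if bytes_scanned <= 0:
        # Assumption: nothing scanned means nothing billed.
        return 0.0
    # Round up to whole megabytes, then apply the per-query minimum.
    billed_mb = max(math.ceil(bytes_scanned / BYTES_PER_MB), MIN_BILLED_MB)
    return billed_mb * BYTES_PER_MB / BYTES_PER_TB * PRICE_PER_TB_USD
```

Under these assumptions, a query that scans exactly 1 TB is estimated at $5, while a query that scans a single kilobyte is still billed as 10 MB. The number of bytes a finished query scanned is reported in its execution statistics, so the same function can be applied after the fact.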

To know about the detailed pricing model for AWS Athena, check out the official documentation here.

Conclusion

In this article, we saw how AWS Athena lets you query data sets in S3 with SQL and extract important statistics. We briefly discussed some of its features and how pricing is calculated. To know more about this service, check out the official documentation here.