ML-Powered Data Warehouse for SERP Analytics
Monitoring website positions in global and local search engines.
15 Jun 2020
6 min read
Summary
- Modern approaches to monitoring a site's position in global and local search engines require large numbers of varied text queries.
- A leading digital marketing agency wanted a scalable, automated service dedicated to search engine optimization (SEO) analysis.
- The agency used the service for day-to-day and long-term analytics and for monitoring the performance of SEO-optimized sites.
- Natural Language Processing and other Machine Learning methods formed the foundation of the solution implemented by our team.
Tech Stack
Akka
Apache Spark
Cassandra
GlusterFS
Kafka
PostgreSQL
Python
Scala
TensorFlow
Timeline
- 2 Weeks: Data Gathering Parser (Data Engineer)
- 1 Week: Solution Architecture Design (Solution Architect)
- 1 Week: Feature Extraction Pipeline Development (Deep Learning Engineer, Deep Learning Researcher)
- 1 Week: Development of Customised LSH (Deep Learning Researcher, Deep Learning Engineer)
- 1 Week: Clustering Performance Optimization (Deep Learning Engineer, Data Engineer)
- 2 Weeks: Data Warehouse Configuration (Data Engineer)
- 6 Weeks: Web Platform Development (Backend Developer, Frontend Developer)
- 2 Weeks: Integration & Deployment (Backend Developer, DevOps)
Tech Challenge
- The solution had to remove laborious manual tasks from the SEO team, allowing the client to considerably improve the quality and revenue of its services.
- The entire range of analytics, from days to years, had to be at the full disposal of the SEO specialist for adjusting parameters and predicting outcomes.
- Parsing the Google Search Console (GSC) data of the websites, then parsing Google search results for the queries taken from GSC.
- Clustering the query-results matrix by links, where the matrix could reach tens of millions by tens of millions in size.
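To give a sense of scale for the last point, here is a minimal sketch. It assumes each query is represented by the set of result links it returned and that similarity between two queries is measured with the Jaccard index. Both are illustrative assumptions, since the case study does not state the exact representation or metric, and the queries and URLs are made up.

```python
from itertools import combinations

# Hypothetical sample: each query maps to the set of result links it returned.
serp_results = {
    "buy running shoes": {"a.com/shoes", "b.com/run", "c.com/sale"},
    "running shoes sale": {"a.com/shoes", "c.com/sale", "d.com/deals"},
    "dance classes kyiv": {"e.com/dance", "f.com/classes"},
}


def jaccard(left: set, right: set) -> float:
    """Share of links two queries have in common (0.0 to 1.0)."""
    if not left and not right:
        return 0.0
    return len(left & right) / len(left | right)


def pairwise_similarity(results: dict) -> dict:
    """Brute-force comparison of every query pair: n * (n - 1) / 2 checks."""
    return {
        (q1, q2): jaccard(results[q1], results[q2])
        for q1, q2 in combinations(sorted(results), 2)
    }


def dense_cells(n_queries: int) -> int:
    """Number of cells in a full n x n query-by-query matrix."""
    return n_queries * n_queries


similarities = pairwise_similarity(serp_results)
```

Storing only the link sets keeps memory proportional to the number of returned links, but the brute-force comparison still grows quadratically. `dense_cells(10_000_000)` is 10^14, which is why an exhaustive pairwise pass is impractical at this size.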
Solution
- Peek queries are formed automatically by our own Natural Language Processing algorithm, which applies a modern approach to monitoring site position in global and local search engines.
- This algorithm considers the structure and content of the target site's pages and builds a large number of varied text queries to get the whole picture of the site's performance.
- Those queries are run and their results stored in the database daily for the whole range of sites. Each site is then analyzed against the competition, using the tool we have built.
- Analytics at scales from days to years are accessible to the SEO specialist for further iterations.
- A customized and optimized Apache Spark-based LSH (Locality Sensitive Hashing) implementation brings clustering complexity close to linear, O(bn). Average clustering time for a 10M x 10M matrix is about 15 minutes.
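The LSH idea in the last point can be illustrated with a small, single-machine sketch. It is not the team's customized Spark implementation: it assumes MinHash signatures over each query's set of result links and a banding scheme, which is one common LSH variant. The function names, the `bands` and `rows` parameters, and the sample data are all hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def stable_hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(links: set, num_hashes: int) -> tuple:
    """One minimum per seeded hash; similar sets share many minima.

    Assumes `links` is non-empty.
    """
    return tuple(
        min(stable_hash(seed, link) for link in links)
        for seed in range(num_hashes)
    )


def candidate_pairs(results: dict, bands: int = 8, rows: int = 2) -> set:
    """Bucket queries band by band; only queries sharing a bucket are paired."""
    buckets = defaultdict(set)
    for query, links in results.items():
        signature = minhash_signature(links, bands * rows)
        for band in range(bands):
            key = (band, signature[band * rows:(band + 1) * rows])
            buckets[key].add(query)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


# Hypothetical sample: two queries return the same links, one is unrelated.
sample = {
    "buy running shoes": {"a.com/shoes", "b.com/run", "c.com/sale"},
    "running shoes to buy": {"a.com/shoes", "b.com/run", "c.com/sale"},
    "dance classes kyiv": {"e.com/dance", "f.com/classes"},
}
pairs = candidate_pairs(sample)
```

In this sketch the bucketing pass does one step per query per band, so it is linear in the number of queries for a fixed number of bands. Only queries that land in the same bucket need an exact comparison afterwards. Queries with partially overlapping links become candidates with a probability that rises with their similarity, and `bands` and `rows` tune that trade-off.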
Impact
- Our team has built a scalable service, which performs search engine optimization analysis.
- It is used for daily as well as long-term analytics and for monitoring the performance of SEO-optimized sites.
- By eliminating the need for manual creation of SEO queries, our solution has saved the agency thousands of working hours, allowing it to allocate people elsewhere.
Important copyright notice © DataRoot Labs and datarootlabs.com, 2024. Unauthorized use and/or duplication of this material without express and written permission from this site’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to DataRoot Labs and datarootlabs.com with appropriate and specific direction to the original content.
Copyright © 2016-2024 DataRoot Labs, Inc.