
Social media: the unseen strings that control every waking (and sleeping) moment of our lives. Whether it is watching MrBeast on YouTube, doom-scrolling for hours on TikTok or Instagram, or sending hundreds of job applications on LinkedIn, our lives revolve around social media. That does not mean everyone is addicted, but most of us cannot imagine our lives without it.
As of 2024, there are 5.22 billion social media users worldwide, accounting for 63.8% of the global population. This is a significant increase from 4.72 billion users in January 2023, with reported growth of about 8% over the past year. The most widely used platforms include YouTube, Facebook, Instagram, and X.
Even though these platforms are used so widely, how social media actually functions remains something of an enigma to most people.
This article delves into the mechanisms behind social media. It covers the digital storage of information, interactions, the building of algorithms, and more.

Data Storage: Where It All Begins
How is the data of billions of people safely stored and maintained by these mammoth companies?
Data storage in social media platforms involves a sophisticated infrastructure designed to handle vast amounts of dynamic, user-generated content efficiently and securely. This infrastructure combines database technologies, cloud services, and security measures to ensure data availability, integrity, and privacy.
What actually happens between the moment we hit send on a post and the moment it is stored?

It all begins when you click send after taking a picture with your friends.

Step 1: User interaction and data packing
Whenever there is activity on our account, such as sharing a picture, the app immediately collects all the relevant information (the image, the caption text, any filter used, and the user ID) and typically packages it in JSON format.
Data in JSON format looks like this:
{
  "name": "Bob",
  "country": "France",
  "phone_number": "XXXXXXXXXX"
}
The data is then ‘wrapped’ into an HTTP/HTTPS request for transmission over the internet.
ANALOGY- Imagine writing a letter with all the details about your trip (content, photos) and sealing it in an envelope (data packet). You also add an address label (metadata) and stamp (authentication token) before handing it over for delivery.
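As a rough illustration of this packing step, here is a minimal Python sketch. The field names, the URL, and the token are placeholders invented for this example and do not belong to any real platform. The request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical post data; the field names are illustrative only.
post = {
    "user_id": "u123",
    "caption": "Day out with friends",
    "filter": "vivid",
    "image_name": "photo.jpg",
}

# Serialize the post to JSON bytes (the "letter").
body = json.dumps(post).encode("utf-8")

# Wrap it in an HTTPS request (the "envelope").
# The URL and the token below are placeholders, not a real API.
request = urllib.request.Request(
    url="https://api.example.com/posts",
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/json",       # the "address label"
        "Authorization": "Bearer PLACEHOLDER_TOKEN",  # the "stamp"
    },
)

print(request.get_method(), request.full_url)
```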
Step 2: Transmission over the internet

Data is transferred in the form of packets. A packet is a small segment of a larger message and, like all digital data, it is made up of 1s and 0s. Each packet carries information about itself called the header, which goes at the front of the packet so that the receiving machine knows what to do with it.
Once the request is broken into packets, it goes to your local router if you are on Wi-Fi, or connects via cell towers if you are on mobile data. The router assigns your device an IP address to identify it on the network and sends the packets to your ISP, which routes them towards their destination, in this case the social media platform’s server. A DNS lookup translates the platform’s domain name into an IP address, and the packets then make multiple hops between routers until they reach it.
This approach, called packet switching, is intentional: it prevents any single connection from monopolizing the network. Without it, sending data between computers all at once would tie up multiple cables, routers, and switches for several minutes per connection. Effectively, only two users could use a given cable at a time, rather than the virtually unlimited number of users the Internet supports today.
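The idea of splitting a message into packets can be sketched in a few lines of Python. This is a simplified illustration: the header fields "seq" and "total" are assumptions made for the example, and real network headers carry far more information.

```python
def to_packets(message: bytes, size: int) -> list[dict]:
    """Split a message into packets, each with a small header."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"seq": n, "total": len(chunks), "payload": chunk}
        for n, chunk in enumerate(chunks)
    ]


def reassemble(packets: list[dict]) -> bytes:
    """Rebuild the message, even if the packets arrived out of order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)


message = b"Picture with friends at the beach!"
packets = to_packets(message, size=8)
packets.reverse()  # simulate packets arriving out of order
restored = reassemble(packets)
print(len(packets), restored == message)
```

Because every packet carries its own sequence number, packets can take different routes and still be put back together at the destination.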
For more information on how the internet works, see:
https://www.cloudflare.com/en-gb/learning/network-layer/how-does-the-internet-work/
The data finally reaches the platform’s CDN (content delivery network), run by providers such as Cloudflare or Akamai. CDNs act like caches: these local servers not only pass the data along but also partially process it to reduce latency.
ANALOGY- Your letter goes to the local post office (router), then travels through a network of trucks and planes (internet backbone), passing through sorting hubs (routers) until it reaches the final post office near the recipient (server).
Step 3: Server processing
What does the server do when it receives the information?
First of all, what is a server? A server is a program or device that provides functionality to other client devices.
Social media servers are located in data centers around the world. Major companies like Meta (which owns Facebook and Instagram) have facilities in the United States, Europe, and Asia to ensure quick access and data redundancy across regions.
When data arrives, social media companies use a load balancer, which distributes traffic across the available servers so that none is overwhelmed. The server then validates the post against the platform’s guidelines, for example checking whether it is spam or contains malicious content. AI/ML models may scan images or text for offensive content. Finally, the server parses the data and attaches a unique identifier (UID) to it.
ANALOGY- The letter is opened at a sorting facility (server). The contents are checked for validity (spam or policy violations), categorized (metadata and tags added), and prepped for delivery by assigning a tracking number (unique identifier).
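The three server-side steps (load balancing, validation, assigning an identifier) could be sketched as follows. This is a toy model under stated assumptions: the server names and the banned-word list are made up, round-robin is only one of several balancing strategies, and real platforms use far more sophisticated moderation than a word check.

```python
import itertools
import uuid

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names
_next_server = itertools.cycle(SERVERS)          # round-robin load balancing

BANNED_WORDS = {"spam-link", "malware"}          # stand-in for real moderation


def handle_post(post: dict) -> dict:
    """Pick a server, validate the caption, and assign a unique ID."""
    server = next(_next_server)
    text = post.get("caption", "").lower()
    if any(word in text for word in BANNED_WORDS):
        return {"server": server, "accepted": False, "reason": "policy violation"}
    return {"server": server, "accepted": True, "post_id": str(uuid.uuid4())}


captions = ["Beach day", "Click this spam-link", "Sunset", "Hello"]
results = [handle_post({"caption": c}) for c in captions]
for result in results:
    print(result["server"], result["accepted"])
```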

Step 4: Storage
The data has now been sent by the user, delivered to the server, and processed according to the platform’s code. It is ready to be stored.
Large social media companies store data using a highly sophisticated combination of technologies, for example databases, cloud services like Azure, and security measures.
What is SQL? SQL stands for Structured Query Language, a standardized programming language that allows users to access, add, modify, and delete data in relational databases.
A database is a collection of information stored digitally, often in the cloud. Databases may be very simple or very complex, and virtually all relational databases use SQL. Text, captions, and metadata are stored in structured relational databases like MySQL or PostgreSQL. These databases allow for quick querying when someone searches for posts or views your profile.
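To make the four basic SQL operations concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. SQLite stands in for MySQL or PostgreSQL purely for convenience, and the table and column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database that lives in RAM
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, user_id TEXT, caption TEXT, created_at TEXT)"
)

# Add data
insert = "INSERT INTO posts (user_id, caption, created_at) VALUES (?, ?, ?)"
conn.execute(insert, ("bob", "Trip to Paris", "2024-05-01"))
conn.execute(insert, ("bob", "Sunset", "2024-05-02"))
conn.execute(insert, ("alice", "Lunch", "2024-05-02"))

# Modify data
conn.execute("UPDATE posts SET caption = ? WHERE id = ?", ("Trip to Paris, day 1", 1))

# Delete data
conn.execute("DELETE FROM posts WHERE user_id = ?", ("alice",))

# Access data: Bob's posts, newest first (as when someone views his profile)
rows = conn.execute(
    "SELECT caption FROM posts WHERE user_id = ? ORDER BY created_at DESC",
    ("bob",),
).fetchall()
print(rows)
```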
Object storage, such as Amazon S3, is another method. Large files like photos and videos are stored as objects and spread across multiple servers for scalability and redundancy.
Data replication- Cross-data-center replication refers to the methods used to replicate and synchronize data across multiple data centers in different locations. The goal is to have identical copies of the data in each location. Replication may happen over various networks, such as local area networks (LAN), storage area networks (SAN), wide area networks (WAN), or in the cloud. It involves transferring data from a primary server to one or more targets. Replication has several benefits:
- It reduces data loss by creating copies.
- It provides high availability: access to data is maintained even during downtime at one location, which makes the platform reliable and scalable.
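A minimal sketch of primary-to-replica copying is shown below. The class and node names are hypothetical, and the copy happens synchronously for simplicity, whereas real systems often replicate asynchronously and must handle conflicts and network failures.

```python
class Node:
    """One data center holding a copy of the data."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.online = True


class ReplicatedStore:
    def __init__(self, primary: Node, replicas: list[Node]):
        self.primary = primary
        self.replicas = replicas

    def write(self, key: str, value: str) -> None:
        # Write to the primary, then copy to every replica.
        self.primary.data[key] = value
        for replica in self.replicas:
            replica.data[key] = value

    def read(self, key: str) -> str:
        # Serve the read from the first data center that is online.
        for node in [self.primary, *self.replicas]:
            if node.online:
                return node.data[key]
        raise RuntimeError("no data center available")


store = ReplicatedStore(Node("us-east"), [Node("europe"), Node("asia")])
store.write("post:1", "Sunset photo")
store.primary.online = False      # simulate downtime at the primary
print(store.read("post:1"))       # still served from a replica
```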
The database indexes your post based on user ID, hashtags, and time so it can be retrieved quickly when others view it.
ANALOGY- At a warehouse (data center), the letter is filed into a cabinet (database) for easy retrieval. Photos are stored in a specialized vault (object storage). To ensure nothing is lost, duplicates are sent to other warehouses (data replication).
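Indexing by user ID, hashtag, and time can be pictured with plain Python dictionaries. This is only a conceptual sketch with made-up posts. Real databases use dedicated index structures such as B-trees.

```python
from collections import defaultdict

# Hypothetical posts; the fields mirror the ones named in the text.
posts = [
    {"id": 1, "user_id": "bob", "hashtags": ["sunset", "beach"], "time": "2024-05-01"},
    {"id": 2, "user_id": "alice", "hashtags": ["sunset"], "time": "2024-05-03"},
    {"id": 3, "user_id": "bob", "hashtags": ["food"], "time": "2024-05-02"},
]

by_user = defaultdict(list)     # user ID -> post IDs, newest first
by_hashtag = defaultdict(list)  # hashtag -> post IDs, newest first

# Walk the posts from newest to oldest so each index list is time-ordered.
for post in sorted(posts, key=lambda p: p["time"], reverse=True):
    by_user[post["user_id"]].append(post["id"])
    for tag in post["hashtags"]:
        by_hashtag[tag].append(post["id"])

print(by_user["bob"])
print(by_hashtag["sunset"])
```

With the index in place, finding Bob's posts is a single dictionary lookup instead of a scan over every post.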

Step 5: Accessing the post
When someone views your post, the platform retrieves the content from storage, applies algorithms, and delivers it via a nearby CDN.
ANALOGY- Your friend requests to see the letter (your post). The warehouse quickly finds it, prioritizes it based on importance (algorithm), and sends a scanned copy via a nearby courier service (CDN) to save time.
This covers how data is stored and accessed by social media platforms.

Data Accessibility: How Data is Retrieved
Data accessibility is a critical component of modern software systems, ensuring that stored data can be efficiently retrieved and utilized by applications and users. This process involves several key mechanisms, including the use of APIs, querying methods, and indexing strategies.
What is an API?
API stands for Application Programming Interface. APIs are mechanisms that enable two software components to communicate with each other using a set of definitions and protocols. For example, the weather bureau’s system contains daily weather data, and the weather app on your phone talks to it through APIs to show you daily updates.
API architecture is usually explained in terms of client and server. The application sending the request is called the client, and the application sending the response is called the server. So in the weather example, the bureau’s weather database is the server, and the mobile app is the client. Social media APIs work the same way, connecting social media platforms with external tools.
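The weather example can be mimicked in a few lines of Python. To keep the sketch self-contained, the "server" is just a function in the same program rather than a machine on the network, and the weather data, function names, and response fields are all invented for illustration.

```python
import json

WEATHER_DATA = {"paris": {"temp_c": 18, "sky": "cloudy"}}  # made-up data


def weather_server(request: dict) -> str:
    """Server side: look up the city and answer with a JSON string."""
    city = request.get("city", "").lower()
    if city not in WEATHER_DATA:
        return json.dumps({"status": 404, "error": "unknown city"})
    return json.dumps({"status": 200, "data": WEATHER_DATA[city]})


def weather_client(city: str) -> str:
    """Client side: send a request, decode the response, format it for display."""
    response = json.loads(weather_server({"city": city}))
    if response["status"] != 200:
        return f"Error: {response['error']}"
    data = response["data"]
    return f"{city}: {data['temp_c']} C, {data['sky']}"


print(weather_client("Paris"))
print(weather_client("Atlantis"))
```

The client never touches the server's data directly. It only sees what the API chooses to return, which is what lets a platform control what external tools can access.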

Step 1: The user asks for data
This is the equivalent of tapping a button to view a post, for example.
The request is sent to the platform’s backend database, typically through APIs.
ANALOGY- Imagine walking into a library and handing a librarian a slip of paper that says, “I want to see all books about sunsets.” The slip is the user’s request, and the librarian is the app sending the request to the backend.
Step 2: Authorization
The platform ensures that the user is authorized to access the requested data.
What Happens:
- The backend API verifies the user’s identity using tokens (e.g., OAuth 2.0 access tokens).
- It checks permissions to ensure the user can access the requested resource (e.g., ensuring private posts aren’t visible to unauthorized users).
A user trying to view a private post triggers an API response with an error if they don’t have the necessary permissions.
ANALOGY- Before giving you the books, the librarian checks your library card (authentication) and makes sure you’re allowed to borrow books from the specific section you’re requesting (authorization).
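The two checks, identity first and permissions second, might look like this in simplified form. The token table and post records are hypothetical, and a plain dictionary lookup stands in for real OAuth 2.0 token validation, which is considerably more involved.

```python
# Hypothetical token table and posts; not a real OAuth 2.0 implementation.
TOKENS = {"token-alice": "alice", "token-bob": "bob"}
POSTS = {
    101: {"owner": "alice", "private": True, "allowed": {"alice", "carol"}},
    102: {"owner": "bob", "private": False, "allowed": set()},
}


def get_post(token: str, post_id: int) -> dict:
    # Authentication: who is making the request?
    user = TOKENS.get(token)
    if user is None:
        return {"status": 401, "error": "invalid token"}

    post = POSTS.get(post_id)
    if post is None:
        return {"status": 404, "error": "post not found"}

    # Authorization: is this user allowed to see this post?
    if post["private"] and user not in post["allowed"]:
        return {"status": 403, "error": "forbidden"}

    return {"status": 200, "post_id": post_id, "owner": post["owner"]}


print(get_post("token-bob", 101))    # private post, Bob is not allowed
print(get_post("token-alice", 101))  # Alice owns it
```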

Step 3: Querying data
Querying data from social media databases involves accessing structured or unstructured data stored in a database. The process depends on the database type (SQL or NoSQL) and the social media platform’s architecture.
Most social media platforms (e.g., Facebook, Twitter, Instagram) provide APIs that allow developers to query data. For instance:
- Facebook Graph API for querying Facebook and Instagram data.
- Twitter API for tweets and user information.
ANALOGY- The librarian goes into the library’s storage areas. For cataloged books (structured data), they look them up in neatly organized shelves (relational database). For special items like old photographs (unstructured data), they go into a separate archive room (object storage).
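A query that touches both kinds of storage can be sketched as follows. SQLite plays the relational database and a plain dictionary plays the object store. The table, keys, and image bytes are placeholders invented for this example.

```python
import sqlite3

# Structured data: captions and metadata in a relational table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, caption TEXT, image_key TEXT)")
db.executemany(
    "INSERT INTO posts (caption, image_key) VALUES (?, ?)",
    [
        ("Sunset at the beach", "img/1.jpg"),
        ("Lunch", "img/2.jpg"),
        ("City sunset", "img/3.jpg"),
    ],
)

# Unstructured data: a dict standing in for an object store, keyed by path.
object_store = {
    "img/1.jpg": b"<bytes of photo 1>",
    "img/2.jpg": b"<bytes of photo 2>",
    "img/3.jpg": b"<bytes of photo 3>",
}


def search(keyword: str) -> list[dict]:
    """Find matching rows in the database, then fetch each image by its key."""
    rows = db.execute(
        "SELECT caption, image_key FROM posts WHERE caption LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    ).fetchall()
    return [{"caption": c, "image": object_store[k]} for c, k in rows]


matches = search("sunset")
print([m["caption"] for m in matches])
```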
Step 4: Data optimization
The data is processed and prepared for efficient delivery:
- It is compressed or resized to suit the user’s device and request.
- APIs manage the final formatting of the data payload, such as structuring it in JSON format for delivery.
- Frequently accessed data may be cached at the application server or CDN level for quicker future responses.
ANALOGY- The librarian packages your requested books. If a book is too heavy, they photocopy only the relevant pages (compressing data). They also ensure the books are in a box that fits through your mailbox (formatting the data payload).
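Compression and payload formatting can be illustrated with Python's standard library. This sketch uses lossless zlib compression and base64 text encoding as assumptions for the example. Real platforms typically resize images and use image-specific formats instead, and the payload field names here are made up.

```python
import base64
import json
import zlib


def build_payload(post_id: int, caption: str, image_bytes: bytes) -> str:
    """Compress the image and wrap everything in a JSON payload."""
    compressed = zlib.compress(image_bytes)
    return json.dumps({
        "post_id": post_id,
        "caption": caption,
        # JSON cannot hold raw bytes, so the compressed data is base64-encoded.
        "image_b64": base64.b64encode(compressed).decode("ascii"),
    })


def read_payload(payload: str) -> dict:
    """What the app does on arrival: decode the JSON and decompress the image."""
    data = json.loads(payload)
    data["image"] = zlib.decompress(base64.b64decode(data.pop("image_b64")))
    return data


image = b"\x00" * 10_000  # a dummy, highly repetitive "image"
payload = build_payload(7, "Sunset", image)
decoded = read_payload(payload)
print(len(image), len(payload), decoded["image"] == image)
```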
Step 5: Delivery using CDN
This process is similar to the one by which requests are sent to the database, just in reverse order. APIs hand the data back to the internet so that it can be delivered to the user. The app decodes the API’s response and then displays it on the screen.
This is how we see the posts that our friends have uploaded.
ANALOGY- Instead of the librarian delivering the books themselves, they use a nearby courier hub (CDN). If the book is popular and multiple people in your area have requested it, the courier already has a copy and doesn’t need to go back to the library. When the courier delivers the books to your home, you open the package (decoding the API response) and arrange the books on your shelf to read (displaying the data in the app).
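The courier-hub behaviour corresponds to a cache in front of the origin server. Below is a minimal sketch with hypothetical class names. Real CDNs also expire and invalidate cached items, which is left out here.

```python
class Origin:
    """The platform's own storage (the library)."""

    def __init__(self):
        self.content = {"post/1": "Sunset photo"}
        self.requests = 0  # how many times the origin was actually contacted

    def fetch(self, key: str) -> str:
        self.requests += 1
        return self.content[key]


class EdgeCache:
    """A CDN server near the user (the courier hub)."""

    def __init__(self, origin: Origin):
        self.origin = origin
        self.cache = {}

    def get(self, key: str) -> tuple[str, str]:
        if key in self.cache:
            return self.cache[key], "hit"   # already held locally
        value = self.origin.fetch(key)      # go back to the origin once
        self.cache[key] = value
        return value, "miss"


origin = Origin()
edge = EdgeCache(origin)
first = edge.get("post/1")
second = edge.get("post/1")
print(first, second, origin.requests)
```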

Security and Privacy
Social media platforms are a ubiquitous feature of everyday life. They help us connect with people across the globe and have fostered communities where users can share their hobbies and interests. With the rise of social media, user privacy and the ability to maintain control over personal information have become paramount.
2FA- Two-factor authentication adds an extra layer of security to our accounts. A code is sent to a mobile phone for verification, so even if our password is leaked, the account remains protected.
Privacy checkup tools
Popular social media platforms like Instagram offer built-in tools in the settings that let users manage their personal data. We can choose which third-party apps get access to our accounts.
Ghost Mode- Also called incognito mode, this lets users browse profiles and content without leaving a trace or footprint, so they can stay anonymous. For example, when Ghost Mode is enabled on Snapchat, our location is not visible to others.

Meta
Meta collects personal data for demographic statistics and also analyzes behaviour on the platform, including interactions. It records users’ interests based on the pages they engage with.
Meta protects this data with multi-factor authentication, and the data is transmitted through encrypted channels to prevent unauthorized access.
YouTube and Google
YouTube and other Google platforms have strict policies around data collection and sharing. Not only do they provide 2FA, they also maintain detailed activity logs. Google’s Safe Browsing feature protects users from malicious sites, which is vital because sites lacking the necessary security measures can leave users vulnerable. When businesses advertise on YouTube and Google, they get an extra layer of protection in the form of brand safety controls and exclusions.
TikTok
TikTok has faced close scrutiny over its approach to privacy. Despite the criticism and almost being banned in the United States, it has implemented measures to address these concerns. It launched an under-13 experience to protect younger users from harmful content, and it has introduced brand safety controls, including the TikTok Inventory Filter in Ads Manager.


AI/ML and Algorithms
Today’s social media experience is closely tied to AI. It has revolutionized how social media content is perceived. According to LinkedIn itself, AI powers everything on the platform. AI is used in social media to provide feed recommendations, and also to suggest content ideas to creators. These systems are a crucial part of the personalized experience we get on a social media platform. AI models analyze user input to offer platforms valuable insights.
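To give a feel for what a feed recommendation does, here is a deliberately simple sketch that ranks posts by how well their topics match a user's interests. The interest weights, topics, and scoring rule are all invented for illustration. Real platforms use machine-learned models trained on far richer signals.

```python
def rank_feed(user_interests: dict[str, float], posts: list[dict]) -> list[dict]:
    """Order posts by how strongly their topics match the user's interests."""

    def score(post: dict) -> float:
        # Sum the user's interest weight for every topic on the post.
        return sum(user_interests.get(topic, 0.0) for topic in post["topics"])

    return sorted(posts, key=score, reverse=True)


# Hypothetical interest weights, e.g. derived from past likes and watch time.
interests = {"travel": 0.9, "food": 0.4, "sports": 0.1}
posts = [
    {"id": "a", "topics": ["sports"]},
    {"id": "b", "topics": ["travel", "food"]},
    {"id": "c", "topics": ["food"]},
]

feed = rank_feed(interests, posts)
print([post["id"] for post in feed])
```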
As the user base grows and users spend more time on the app, its advertising value increases, which in turn increases demand from advertisers. Thus social media companies depend on AI not only to improve the user experience but also to drive their revenue.
The impact of AI has also changed the marketing industry. 90% of leaders in social media marketing, customer care, and communications think that social media data (SMD) is vital, and 95% of them recognize AI as a powerful tool for optimizing data analysis.