Datafication & Technology

Datafication, Phantasmagoria of the 21st Century


DATAFIED (Video presentation for the Capra Course Alumni)

DATAFIED: A Critical Exploration of the Production of Knowledge in the Age of Datafication

This presentation by Hélène Liu introduces the main findings of her PhD critical research on the profound epistemological shift that accompanies the digital age. To a large extent, civilisations can be understood by the kind of knowledge they produce, and how they go about knowing what they know.

Inspired by The Arcades Project, the seminal work of early 20th-century philosopher and social critic Walter Benjamin, “DATAFIED” asks what civilisation is emerging at the dawn of the 21st century. The spread of algorithms (based on quantified, discrete, computer-ready data bits) to all qualitative aspects of life has far-reaching consequences.

The fanfare around the novelty of social media, and more recently of AI, obfuscates the old-paradigm ideology of quantification underlying the development of those technologies. The language used since its inception anthropomorphises digital technology and conceals a fundamental difference between datafied and human ways of knowing. As we embark on a new wave of increasingly inescapable digital architectures, it has become more urgent and more crucial to critically investigate their problematic epistemological dimension.

The video begins with an introduction of Hélène Liu and is followed by her talk that concludes with pointers toward a more regenerative ecology of knowing deeply inspired by the knowledge and insights shared during the Capra course (capracourse.net). After her presentation we hear reactions and reflections by Fritjof Capra, the teacher of the Capra Course and co-author of The Systems View of Life.

Presenter: Hélène Liu 
Hélène holds Master's degrees from the Institut d’Etudes Politiques de Paris-University of Paris (Economics and Finance) and the University of Hong Kong (Buddhist Studies), and a PhD from the School of Design at the Polytechnic University of Hong Kong. She is a long-term meditator and student of Vajrayana Buddhism. She recently produced and is releasing her first music album, The Guru Project (open.spotify.com/artist/3JuD6YwXidv7Y2i1mBakGY), which emerged from a concern about the divisiveness of the algorithmic civilisation. The album brings together the universal language of mantras with music from a diversity of geographies and genres, as a call to focus on our similarities rather than our differences.

NB: The link to the Vimeo is https://vimeo.com/839319910

How Data Companies Get Your Data

In the early days of the commercial internet, websites just used cookies to track users.

Today, tracking has become much deeper and more sophisticated. The advertising tech industry has developed new ways to track users. NB: what I call the advertising tech industry is basically Facebook, Google and a myriad of data brokers or “consumer intelligence” (nice euphemism) companies.

The key to programmatic advertising, the ad industry on which the entire internet as we know it today runs, is IDENTITY.

The industry monitors and scoops up information about everything people do on the internet, i.e., all the big data. But then it needs to associate actions, data and insights with individual identities (if you visit YouTube every day from your home, office, cafe and gym, and also do so from your laptop, mobile, tablet and desktop computer, those are all disparate tidbits of info that need to be unified and linked under your identity to become valuable information).

So the key for data industry players is now all about managing unique user identities/profiles, to each of which they add behavioural tracking and other data.

In their systems, I have a profile, you have a profile, etc… but not a profile in the sense that we have made an account with the data company, but rather based on the dossiers they have on all of us.
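To make the idea of "unifying disparate tidbits under one identity" more concrete, here is a minimal sketch in Python. It is not how any particular data company works: the event format, the identifier names and the `link_profiles` function are all hypothetical, and the linking rule (any shared identifier merges two records) is a deliberate simplification.

```python
from collections import defaultdict

# Hypothetical tracking events: each carries whatever identifiers were visible.
EXAMPLE_EVENTS = [
    {"ids": {"cookie": "c1", "ip": "203.0.113.7"}, "action": "video at home (laptop)"},
    {"ids": {"device": "phone-A", "ip": "203.0.113.7"}, "action": "video at home (mobile)"},
    {"ids": {"device": "phone-A", "email_hash": "e9"}, "action": "signed in at the gym"},
    {"ids": {"cookie": "c2"}, "action": "unrelated visitor"},
]


def link_profiles(events):
    """Group events that share any identifier into one profile (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    def keys(event):
        return [f"{kind}:{value}" for kind, value in event["ids"].items()]

    # Identifiers seen together in one event are assumed to belong to one person.
    # Real systems treat weak signals such as a shared IP far more cautiously.
    for event in events:
        ids = keys(event)
        for other in ids[1:]:
            union(ids[0], other)

    profiles = defaultdict(list)
    for event in events:
        ids = keys(event)
        if ids:
            profiles[find(ids[0])].append(event["action"])
    return list(profiles.values())


if __name__ == "__main__":
    for number, actions in enumerate(link_profiles(EXAMPLE_EVENTS), start=1):
        print(f"profile {number}: {actions}")
```

In this toy example the laptop cookie, the home IP address, the phone and the hashed email all end up in one dossier, even though no single event contained all four identifiers.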

Here is an interesting article listing out some key ID data companies. They are the ones that compile and maintain shadow identities of people.

The article outlines how each company creates an ID (by email address, IP address, postal address, cookies, device software/hardware information, combination of these, etc…).

Here is a graph summarising the sources contributing to building profiles.

This graph helps you to take action to protect your privacy for each of the items listed above.

  • Email:
    • For about the price of a cup of coffee a day, you can subscribe to ProtonMail, the most secure email on the planet. ProtonMail even allows you to create alias email addresses connected to your main email, so you never have to reveal your real email anymore.
    • SimpleLogin is a great solution that lets you create as many email aliases as you want. I use it A LOT! It also allows you to create your own subdomain name (i.e., what comes after the “at” in an email address, something like “user.aleeas.com”). From there, you can make up an alias by placing the name of the newsletter, shopping site or delivery company (or anything else) before the “at” (something like “amazon at user.aleeas.com”). All emails will land in your inbox, but your REAL email (and therefore your identity) is protected. Check the SimpleLogin documentation for more info.
  • Phone Number: you can subscribe to services that give you alias phone numbers that you can use for online purchases.
  • Name: never give your full name when you subscribe to newsletters or browse the internet. Most of the time initials will suffice.
  • Postal address: this one is harder to hide, but you can consider using a PO Box.
  • IP Address: use a reliable VPN, and remember, if you do not pay for the service, your data is your payment! So again, pay to get reliable, secure services. ProtonMail has a very good VPN. They have a package available that bundles email, VPN, aliases etc. Check their website, they often have special promotions.
  • Browser activity: use safe browsers such as TOR, Mullvad Browser or Brave.
  • Device Data: check your privacy settings, disable location services for all apps, and only enable them for the apps that REALLY need them, while you are using the app.
  • First party cookies: use safe browsers such as TOR, Mullvad Browser or Brave.
  • Third-party cookies: most browsers allow you to stop third party cookies (although I would not trust any browser that belongs to a big tech company).
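The per-service alias pattern described in the Email item above (service name before the “at”, your alias subdomain after it) can be sketched in a few lines of Python. This only builds the address string; it does not talk to SimpleLogin, and the `make_alias` helper is a hypothetical name used for illustration. Whether an alias made up this way is accepted depends on how your alias subdomain is configured, so check the SimpleLogin documentation.

```python
import re


def make_alias(service: str, alias_domain: str) -> str:
    """Build a per-service alias such as amazon@user.aleeas.com."""
    # Keep only lowercase letters and digits; collapse everything else to "-".
    local_part = re.sub(r"[^a-z0-9]+", "-", service.lower()).strip("-")
    if not local_part:
        raise ValueError("service name must contain a letter or digit")
    return f"{local_part}@{alias_domain}"


if __name__ == "__main__":
    print(make_alias("Amazon", "user.aleeas.com"))
    print(make_alias("The Conversation newsletter", "user.aleeas.com"))
```

Using one alias per service has a side benefit: if spam starts arriving at a given alias, you know which service leaked or sold the address.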

Check the previous posts on this blog for more privacy tips.

A Not-So-Dematerialised Internet? Undersea Cables (from “The Conversation”)

Undersea cables are the unseen backbone of the global internet.

Special ships lay data cables across the world’s oceans. Stefan Sauer/picture alliance via Getty Images

Robin Chataut, Quinnipiac University

Have you ever wondered how an email sent from New York arrives in Sydney in mere seconds, or how you can video chat with someone on the other side of the globe with barely a hint of delay? Behind these everyday miracles lies an unseen, sprawling web of undersea cables, quietly powering the instant global communications that people have come to rely on.

Undersea cables, also known as submarine communications cables, are fiber-optic cables laid on the ocean floor and used to transmit data between continents. These cables are the backbone of the global internet, carrying the bulk of international communications, including email, webpages and video calls. More than 95% of all the data that moves around the world goes through these undersea cables.

These cables are capable of transmitting multiple terabits of data per second, offering the fastest and most reliable method of data transfer available today. A terabit per second is fast enough to transmit about a dozen two-hour, 4K HD movies in an instant. Just one of these cables can handle millions of people watching videos or sending messages simultaneously without slowing down.
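As a rough check of the "dozen movies" figure, the arithmetic can be sketched in Python. The movie size is an assumption (about 10 GB for a two-hour 4K film, which varies a lot with compression), not a figure from the article.

```python
TERABIT_IN_BITS = 10**12
BITS_PER_BYTE = 8
ASSUMED_MOVIE_SIZE_GB = 10  # assumption: a compressed two-hour 4K movie


def movies_per_second(link_tbps: float, movie_size_gb: float = ASSUMED_MOVIE_SIZE_GB) -> float:
    """How many movies of the given size fit through the link each second."""
    bytes_per_second = link_tbps * TERABIT_IN_BITS / BITS_PER_BYTE
    return bytes_per_second / (movie_size_gb * 10**9)


if __name__ == "__main__":
    # 1 Tbit/s = 125 GB/s, i.e. 12.5 movies of 10 GB each per second.
    print(movies_per_second(1))
```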

About 485 undersea cables totaling over 900,000 miles sit on the ocean floor. These cables span the Atlantic and Pacific oceans, as well as strategic passages such as the Suez Canal and isolated areas within oceans.

Undersea cables tie the world together. TeleGeography, CC BY-SA

Laying cables under the sea

Each undersea cable contains multiple optical fibers, thin strands of glass or plastic that use light signals to carry vast amounts of data over long distances with minimal loss. The fibers are bundled and encased in protective layers designed to withstand the harsh undersea environment, including pressure, wear and potential damage from fishing activities or ship anchors. The cables are typically as wide as a garden hose.

The process of laying undersea cables starts with thorough seabed surveys to chart a map in order to avoid natural hazards and minimize environmental impact. Following this step, cable-laying ships equipped with giant spools of fiber-optic cable navigate the predetermined route.

As the ship moves, the cable is unspooled and carefully laid on the ocean floor. The cable is sometimes buried in seabed sediments in shallow waters for protection against fishing activities, anchors and natural events. In deeper areas, the cables are laid directly on the seabed.

Along the route, repeaters are installed at intervals to amplify the optical signal and ensure data can travel long distances without degradation. This entire process can take months or even years, depending on the length and complexity of the cable route.

Video: How undersea cables are installed. https://www.youtube.com/embed/yd1JhZzoS6A?wmode=transparent&start=0

Threats to undersea cables

Each year, an estimated 100 to 150 undersea cables are cut, primarily accidentally by fishing equipment or anchors. However, the potential for sabotage, particularly by nation-states, is a growing concern. These cables, crucial for global connectivity and owned by consortia of internet and telecom companies, often lie in isolated but publicly known locations, making them easy targets for hostile actions.

The vulnerability was highlighted by unexplained failures in multiple cables off the coast of West Africa on March 14, 2024, which led to significant internet disruptions affecting at least 10 nations. Several cable failures in the Baltic Sea in 2023 raised suspicions of sabotage.

The strategic Red Sea corridor has emerged as a focal point for undersea cable threats. A notable incident involved the attack on the cargo ship Rubymar by Houthi rebels. The subsequent damage to undersea cables from the ship’s anchor not only disrupted a significant portion of internet traffic between Asia and Europe but also highlighted the complex interplay between geopolitical conflicts and the security of global internet infrastructure.

Protecting the cables

Undersea cables are protected in several ways, starting with strategic route planning to avoid known hazards and areas of geopolitical tension. The cables are constructed with sturdy materials, including steel armor, to withstand harsh ocean conditions and accidental impacts.

Beyond these measures, experts have proposed establishing “cable protection zones” to limit high-risk activities near cables. Some have suggested amending international laws around cables to deter foreign sabotage and developing treaties that would make such interference illegal.

The recent Red Sea incident shows that help for these connectivity challenges might lie above rather than below. After cables were compromised in the region, satellite operators used their networks to reroute internet traffic. Undersea cables are likely to continue carrying the vast majority of the world’s internet traffic for the foreseeable future, but a blended approach that uses both undersea cables and satellites could provide a measure of protection against cable cuts.

Robin Chataut, Assistant Professor of Cybersecurity and Computer Science, Quinnipiac University

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Regulating Big Tech

Read this article by Joseph Stiglitz (see bio below) in Project Syndicate about the nascent steps to protect data privacy in the US. In February 2024, the Biden administration issued an executive order to ban the transfer of certain types of “sensitive personal” data to some countries.

This is a drop in the ocean, and the US is way behind in protecting its citizens’ data from being exploited by the players in the data economy (compared to the EU, for example). However, it is probably the beginning of a trend toward increased protection against a predatory system that has created too many anti-competitive practices and social harms to be listed here. Admittedly, the US is walking on eggshells because regulating the digital seems directly at odds with the US competitive advantage in this domain.

The firms that make money from our data (including personal medical, financial, and geolocation information) have spent years trying to equate “free flows of data” with free speech. They will try to frame any Biden administration public-interest protections as an effort to shut down access to news websites, cripple the internet, and empower authoritarians. That is nonsense.

Over the past 20-25 years, the narrative about digital technology has been consistently driven by Big Tech to hide the full extent of what was really happening. The idealistic beliefs in democratisation, equality, friendship and connection from the early internet served as a smokescreen for the development of a behemoth, fundamentally exploitative data industry that pervades all areas of the economy and society.

Today, large tech monopolies use indirect ways to try to quash attempts to change the status quo and counter Big Tech abuses.

Tech companies know that if there is an open, democratic debate, consumers’ concerns about digital safeguards will easily trump concerns about their profit margins. Industry lobbyists thus have been busy trying to short-circuit the democratic process. One of their methods is to press for obscure trade provisions aimed at circumscribing what the United States and other countries can do to protect personal data.

The article details previous attempts to push provisions curbing executive and congressional power over data regulation and to establish special clauses in trade pacts granting secrecy rights (an ironic state of affairs considering that the early Internet was developed on exactly opposite values). It is important to realise that most efforts are spent on surreptitious (INDIRECT) ways to limit any possibility of regulation, through trade agreements for example: what Stiglitz calls “Big Tech’s favoured ‘digital trade’ handcuffs”.

Stiglitz's concluding remark reminds us that the stakes are high: ultimately, the choices made today have the potential to impact the democratic order.

Whatever one’s position on the regulation of Big Tech – whether one believes that its anti-competitive practices and social harms should be restricted or not – anyone who believes in democracy should applaud the Biden administration for its refusal to put the cart before the horse. The US, like other countries, should decide its digital policy democratically. If that happens, I suspect the outcome will be a far cry from what Big Tech and its lobbyists were pushing for.

Joseph E. Stiglitz, a Nobel laureate in economics and University Professor at Columbia University, is a former chief economist of the World Bank (1997-2000), chair of the US President’s Council of Economic Advisers, and co-chair of the High-Level Commission on Carbon Prices. He is Co-Chair of the Independent Commission for the Reform of International Corporate Taxation and was lead author of the 1995 IPCC Climate Assessment.

Privacy Guides – Restore Your Online Privacy

Privacy Guides is a collection of cybersecurity resources and privacy-focused tools for protecting yourself online.

Start your privacy journey here. Learn why privacy matters, the difference between Privacy, Secrecy, Anonymity and Security, and how to determine which threat model best corresponds to your needs.

Here are some examples of threats. You may want to protect yourself from some of them and not care much about others.

  • Anonymity – Shielding your online activity from your real identity, protecting you from people who are trying to uncover your identity specifically.
  • Targeted Attacks – Being protected from hackers or other malicious actors who are trying to gain access to your data or devices specifically.
  • Passive Attacks – Being protected from things like malware, data breaches, and other attacks that are made against many people at once.
  • Service Providers – Protecting your data from service providers (e.g. with E2EE, which renders your data unreadable to the server).
  • Mass Surveillance – Protection from government agencies, organisations, websites, and services which work together to track your activities.
  • Surveillance Capitalism – Protecting yourself from big advertising networks, like Google and Facebook, as well as a myriad of other third-party data collectors.
  • Public Exposure – Limiting the information about you that is accessible online—to search engines or the general public.
  • Censorship – Avoiding censored access to information or being censored yourself when speaking online.

Here, you can read about Privacy Guides' recommendations for a whole range of online privacy tools, from browsers to service providers (cloud storage, email services, email aliasing services, payment, hosting, photo management, VPNs etc), software (sync, data redaction, encryption, file sharing, authentication tools, password managers, productivity tools, communication such as messaging platforms etc) and operating systems.

You can also understand some common misconceptions about online privacy (think: “VPN makes my browsing more secure”, “open source is always secure” or “complicated is better” amongst others).

You can also find valuable information about account creation: what happens when you create an account, understanding Terms of Service and Privacy Policies, and how to secure an account (password managers, authentication software, email aliases etc). And just as important (maybe more), about account deletion (we leave A LOT of traces in the course of our digital life, and it’s important to become aware of what they are and how to reduce their number).

AND MUCH MORE!

I can’t recommend this website enough. Visit it, revisit it, bookmark it and share it with friends and enemies. 🙂

[HOW TO] Mitigating Tracking

This is from an exchange with a privacy and security expert friend. I am publishing his replies to my questions “as is” (no editing).

Many people ask me about tracking. What is it? Can we prevent it?

Meta/FB pixel and Google Analytics are the two most pervasive tracking tools that follow people all around the web. Vast majority of sites have either or both running silently in the background. And each can see down to the most minute detail everything a user does on a website – every link or page that gets clicked or accessed, your mouse movements, the data you enter into every form or text box or search bar, the credentials you input to sign up or register for a service, the time you spend viewing a certain piece of content on the site, and countless other things etc… (visit deviceinfo.me to see example of all the little things a site can track and recognise about your computer).

And then all that data gets recorded and associated with your identity, on either a 100% precise “deterministic” basis (meaning FB or Google know you personally are the user), or on a “probabilistic” basis (when they don’t know for a fact it is you but can infer that it is likely you based on a range of clues/patterns).

Tracking is deterministic for most internet users (i.e. those not taking precautions to prevent and block tracking). Tracking is probabilistic for the small segment that actively try to mitigate against the tracking with various techniques (someone like me).

The goal for someone who cares and is operating in the probabilistic bucket is to actively thwart the tracking to the extent where FB/Google is unable to, with a good degree of confidence, link your identity to the given activity.
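The difference between the two buckets can be illustrated with a small Python sketch. It is not how Facebook or Google actually score visits: the signal names, the weights, the threshold and the `match_visit` function are all invented for illustration.

```python
# Hypothetical weights: how strongly each matching signal suggests "same person".
SIGNAL_WEIGHTS = {
    "ip": 0.3,
    "user_agent": 0.2,
    "screen": 0.1,
    "timezone": 0.1,
    "fonts_hash": 0.3,
}


def match_visit(visit: dict, profile: dict, threshold: float = 0.7):
    """Return (bucket, confidence) for linking one visit to one stored profile."""
    # Deterministic: the visitor is signed in, so the tracker knows who it is.
    if visit.get("account_id") and visit["account_id"] == profile.get("account_id"):
        return ("deterministic", 1.0)

    # Probabilistic: add up the weights of the clues that match the profile.
    score = sum(
        weight
        for signal, weight in SIGNAL_WEIGHTS.items()
        if visit.get(signal) is not None and visit.get(signal) == profile.get(signal)
    )
    bucket = "probabilistic" if score >= threshold else "unlinked"
    return (bucket, round(score, 2))


if __name__ == "__main__":
    profile = {"account_id": "u42", "ip": "203.0.113.7", "user_agent": "UA-1",
               "screen": "1440x900", "timezone": "UTC+8", "fonts_hash": "f00d"}
    signed_in = {"account_id": "u42"}
    same_setup = {"ip": "203.0.113.7", "user_agent": "UA-1", "fonts_hash": "f00d"}
    mitigated = {"screen": "1440x900", "timezone": "UTC+8"}
    for visit in (signed_in, same_setup, mitigated):
        print(match_visit(visit, profile))
```

In this toy model, the mitigations discussed here (separate browsers, a VPN, blocking trackers) work by removing or changing signals so that the score stays below the tracker's confidence threshold.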

But there is otherwise no way to 100% prevent such tracking, to fully escape all deterministic and probabilistic tracking of your activity, other than not owning digital devices and never accessing the internet.

The most basic + doable + minimal pain actions to take to move oneself away from being in the deterministic bucket and into the probabilistic category are:

  1. Practice “browser isolation”, meaning use one browser exclusively for Facebook/meta/Instagram + Google/Gmail things, and for nothing else. And then use another separate browser for all your other non-FB/Google internet activity. Key is to make sure you NEVER sign into your FB/Google/Gmail accounts on your non-FB/Google browser (as the moment this happens, FB/Google are able to immediately link that browser and all its future activity to your personal identity).
  2. Do NOT use Google Chrome Web browser as your non-FB/Google browser. Use Firefox or Brave Browser instead. And again, NEVER log into any FB/Google account on your Firefox/brave browser (and try to avoid as much as possible even visiting any FB/Google products or websites on that browser).
  3. Install and activate the browser extension uBlock Origin into your non-FB/Google browser.
  4. Do not use Google Search in your non-FB/Google browser, and don’t go to Google to make searches. Use privacy alternatives like DuckDuckGo (www.duckduckgo.com) or Brave Search. This preference can be toggled in the browser settings.

Of course one of the most effective actions is to fully delete your accounts with and entirely avoid using any Facebook/Meta + Google products/services, but this is too big a jump for most people and still doesn’t mitigate the tracking 100% (as even without a formal account on FB/Google, without further mitigations in place, they are still able to identify you as a unique user and track you using their created “shadow profile”).

All of this is only basic tracking mitigation for standard desktop web browser activity (i.e. just visiting websites on your computer). The many other ways our digital behaviour is tracked require their own other set of mitigations, so this only covers one part of it, but is an effective and easy start.

Can you outline a complete strategy to mitigate tracking?

I’d say overall there are a few key domains to look at:

  • Web browsing (basic mitigation as above).
  • Mobile devices because these are one of the biggest sources of privacy leakage in most people’s lives (mitigation being switching to a de-googled android device instead of iPhone or regular android + limiting installed apps to only vital ones).
  • Social media for obvious reasons (deleting and avoiding social media, or at least Facebook or generally be sparing in use and minimise data consciously shared on platform).
  • Email because all email on traditional providers is not private, all content can be and is actively read and analysed by provider (migrate away from Gmail, outlook, yahoo, apple etc and move to trustworthy privacy respecting email providers like protonmail or tutanota).
  • Cloud storage services, for the same reason as email (migrate away from Dropbox/other big tech cloud storage providers, also move to privacy friendly ones like proton).
  • Communications, because normal communications are either not private or secure or both (try to use Signal www.signal.org over WhatsApp, try to use Signal call/message over regular phone call or SMS, even WhatsApp is better for voice calls/messaging compared to traditional phone call/SMS as at least it is end to end encrypted).
  • Use unique account credentials for each of your online accounts, with different complex password for each. Avoid using the same password (or the same password with only minor variations) for all services (more for general security but still important as cannot have privacy without security, for basic use recommend Bitwarden www.bitwarden.com with a very strong master password that you keep close guard over).
  • Use multi-factor or two-factor (MFA or 2FA) authentication to secure accounts wherever possible (ideally use TOTP time based codes via an app like Aegis or enteAuthenticator).
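For the curious, the TOTP time-based codes mentioned in the last item can be sketched with Python's standard library. This follows the published HOTP/TOTP scheme (RFC 4226 and RFC 6238) with HMAC-SHA1; it is a learning aid, not a replacement for an authenticator app. Real apps usually receive the shared secret as a base32 string, whereas this sketch assumes raw bytes.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time code (RFC 4226)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low 4 bits pick the offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def totp(secret: bytes, at_time=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time code (RFC 6238): HOTP over the current 30-second window."""
    if at_time is None:
        at_time = time.time()
    return hotp(secret, int(at_time) // step, digits)


if __name__ == "__main__":
    demo_secret = b"12345678901234567890"  # test secret from the RFCs, never use for real
    print(totp(demo_secret))
```

Because the code depends only on the shared secret and the clock, it can be generated offline on your device, which is why it does not rely on the phone network the way SMS codes do.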

NB: The links above are clean (i.e., not affiliate links); I do not get any reward when you subscribe to those services.

Leaving Traces Online, Identifiers.

Visit this website (or copy and paste https://www.deviceinfo.me) and it will show you a long list of all the identifiers that every website you visit can find out about you, your location, your device etc… All these different data points are then used to create a “fingerprint” of your web browser, allowing the rest of your web activity on that same browser/device to be tracked.

NB: You can visit this website from any of your devices (mobile or desktop/laptop).
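The idea of a browser “fingerprint” described above can be sketched in Python: combine many individually harmless attributes into one stable identifier. The attribute names and the `fingerprint` function are illustrative assumptions; real fingerprinting scripts collect far more signals and tolerate small changes.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of browser/device attributes into a short, stable identifier."""
    # Sorting the keys makes the result independent of collection order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    browser = {
        "user_agent": "ExampleBrowser/1.0",
        "screen": "1440x900",
        "timezone": "Asia/Hong_Kong",
        "language": "en-GB",
        "fonts": ["Arial", "Helvetica", "Noto Sans"],
    }
    print(fingerprint(browser))
    # Changing a single attribute yields a different identifier.
    print(fingerprint({**browser, "timezone": "Europe/Paris"}))
```

No cookie is involved here: the identifier is recomputed from what the browser reveals on each visit, which is why clearing cookies alone does not stop this kind of tracking.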

A Lost Civilisation

My PhD research on the epistemological shifts that accompany the rise of the digital age has alerted me to the high value the digital civilisation puts on turning the non-quantifiable aspects of our lives and our experience into quantified, computer-ready data. I have become more aware of the numerous deleterious consequences of this phenomenon.

Qualitative dimensions of life such as emotions, relationships or intuitive perceptions (to name a few) which draw upon a rich array of tacit knowing/feeling/sensing common to all of life are undermined and depreciated. In many areas of decision-making at the individual and collective levels, the alleged neutrality of the process of digital quantification is put forward as an antidote to the biases of the human mind, unreliable as it is encumbered by emotions and prejudices. While certain areas of the economy lend themselves to quantitative measurement, most crucial aspects of the experience of living do not.

There is a logical fallacy in the belief that digital data are neutral. They are produced in and by social, cultural, economic, historical (etc) contexts and consequently carry the very biases present in those contexts. There is a logical fallacy in the belief that algorithms are neutral. They are highly designed to optimise certain outcomes and fulfil certain agendas which, more often than not, do not align with the greater good.

Far from being a revolution, the blind ideological faith in digital data is directly inherited from the statistical and mechanistic mindset of the Industrial Revolution and supports the positivist view that all behaviours and sociality can be turned into hard data. The enterprise of eradicating uncertainty and ambiguity under the guise of so-called scientific measurement is such an appealing proposition for so many economic actors that we have come to forget what makes us human.

A civilisation that has devalued and forgotten the humanness of being human is a lost civilisation.

UK Police to Double The Use of Facial Recognition (The Guardian) & Fawkes

This is an article published by The Guardian on October 29, 2023.

https://www.theguardian.com/technology/2023/oct/29/uk-police-urged-to-double-use-of-facial-recognition-software

The UK policing minister encourages police departments throughout the country to drastically increase the use of facial recognition software, and to include passport photos in the AI database of recognisable images.

Excerpts:

“Policing minister Chris Philp has written to force leaders suggesting the target of exceeding 200,000 searches of still images against the police national database by May using facial recognition technology.”

“He also is encouraging police to operate live facial recognition (LFR) cameras more widely, before a global artificial intelligence (AI) safety summit next week at Bletchley Park in Buckinghamshire.”

“Philp has also previously said he is going to make UK passport photos searchable by police. He plans to integrate data from the police national database (PND), the passport office and other national databases to help police find a match with the “click of one button”.”

If the widespread adoption of facial recognition software (that can recognise and identify a face even when it is partially covered) concerns you, you may want to consider using FAWKES, an image cloaking software developed at the SAND Lab at the University of Chicago.

The latest version (2022) includes compatibility with Mac M1 chips.

http://sandlab.cs.uchicago.edu/fawkes

This is what the SAND Lab website says:

The SAND Lab at University of Chicago has developed Fawkes, an algorithm and software tool (running locally on your computer) that gives individuals the ability to limit how unknown third parties can track them by building facial recognition models out of their publicly available photos. At a high level, Fawkes “poisons” models that try to learn what you look like, by putting hidden changes into your photos, and using them as Trojan horses to deliver that poison to any facial recognition models of you. Fawkes takes your personal images and makes tiny, pixel-level changes that are invisible to the human eye, in a process we call image cloaking. You can then use these “cloaked” photos as you normally would, sharing them on social media, sending them to friends, printing them or displaying them on digital devices, the same way you would any other photo. The difference, however, is that if and when someone tries to use these photos to build a facial recognition model, “cloaked” images will teach the model an highly distorted version of what makes you look like you. The cloak effect is not easily detectable by humans or machines and will not cause errors in model training. However, when someone tries to identify you by presenting an unaltered, “uncloaked” image of you (e.g. a photo taken in public) to the model, the model will fail to recognize you.

I downloaded FAWKES on my M1 MacBook, and while a bit slow, it works perfectly. You may have to tweak your privacy and security settings (in System Settings) to allow FAWKES to run on your computer. I also recommend using the following method to open the app the first time you use it: go to Finder > Applications > FAWKES. Right click on the app name and select “Open”.

Be a bit patient: it took 2-3 minutes for the software to open when I first used it, and it may take a few minutes to process photos. But all in all, it is working very well. Please note that it only seems to work on an M1 chip MacBook but not on an iMac.
