1 of 45

Cloud Deployment; Course Recap

CSCI 344: Advanced Web Technologies

Fall 2024

2 of 45

Announcements

  1. HW5 due next Tuesday (12/10) at midnight
  2. Office hours next week:
    • Monday-Wednesday (12/9-12/11): Appointment slots from 12-2PM for individual help
    • Mon-Th (12/9-12/12): “Open office hours” from 2-3PM
  3. Take-home Quiz 2 released on Wednesday morning (12/11), due Friday at midnight.
  4. All late work must be submitted by next Friday, 12/13

3 of 45

Outline

  • Quiz next week
  • Course reflection
  • Broader impacts

5 of 45

Quiz next Wednesday

Next Wednesday’s quiz will be one final JavaScript activity. Things it will test:

  • Can you add an event handler to a button?
  • Can you access data that a user typed into a form and use it?
  • Can you write a JavaScript function that queries a REST API and returns the resulting object / array?
  • Can you build an HTML template (using a template literal) that outputs a JSON object in a visual form?
  • Can you loop through an array of data and add a widget for each data element to the DOM?

6 of 45

Let’s Do a Practice Activity

  • Please download the lecture files

7 of 45

Course Reflection

  • My hope for this course was that you would gain some experience with various aspects of the web stack
  • Even if you aren’t yet an expert at any one piece, you hopefully have a better sense of how the pieces fit together, and how you might build your own system (perhaps in a future projects course?)
  • Hopefully you’ve also learned a bit more about how your own computer works (system paths, software installation and management, versioning)
  • Going forward, I am confident that you will be in a better position to do online tutorials on your own and/or learn new frameworks and libraries

8 of 45

Course Reflection: Web & Internet Architecture

  • An introduction to the Internet and the web
  • High-level overview of TCP/IP and DNS
  • How a browser works
  • Some different approaches to designing systems that use web clients and web servers
  • Fundamentals of HTTP – headers, methods (GET, POST, PATCH, DELETE), statelessness
  • Principles of RESTful APIs
  • Some security considerations

9 of 45

Internet “Backbone”: Global

10 of 45

Static Web Page Architecture

Weeks 1-6, Tutorials 2-6, Homework 2

11 of 45

Dynamic, Client-Side Architecture

Weeks 11-13, Tutorials 7-8, HW 3-4

12 of 45

Server-Side Architecture

Weeks 14-18, Tutorials 9-11, HW 5-6

13 of 45

Course Reflection: Design & Human Factors

  • Accessibility considerations
  • Design principles (contrast, proximity, alignment, repetition, color, fonts, etc.)
  • Privacy and societal considerations
  • Design patterns and dark patterns

14 of 45

Course Reflection: Client-Side Programming

HTML

  • Explored the various HTML tags and how they work.

CSS

  • Practiced using selectors (element, id, and class selectors)
  • Went over common layout approaches (flexbox, box model, media queries)
  • Experimented with using Tailwind (CSS framework) – to standardize and simplify the process of styling and customizing web pages.

15 of 45

Course Reflection: Client-Side Programming

JavaScript

  • Introduction to the language
  • Targeting and manipulating DOM elements
  • Working with the browser’s built-in Developer Tools
  • Making server requests using the browser’s fetch API
  • Working with a client-side web framework (React)
  • Implementing accessibility features

16 of 45

Course Reflection: Server-Side Programming

On the “back end,” we covered quite a few concepts as well, including...

  1. Introduction to the Python language
  2. Introduction to the SQL language (using PostgreSQL)
  3. Leveraging the Flask and SQLAlchemy libraries (in order to build a REST API that interacts with data).
  4. Working with an automated testing framework

17 of 45

Course Reflection: Systems Literacy

And we also practiced some skills and ideas that made these explorations possible...

  1. Installing and working with Node.js (JavaScript) and external libraries and packages (npm install…)
  2. Installing and working with Python, Python dependencies, and virtual environments
  3. Some experience with git / GitHub
  4. Working on the command line
  5. Setting environment variables
  6. Modifying your system path
  7. Debugging skills (practicing how to be resourceful)

18 of 45

You’ve learned some powerful skills!

The skills we’ve been learning about in this class (and in others) give you incredible power. Let’s name those powers...

  1. You can create a website that anyone in the world can potentially access
  2. That website can gather, store, and transmit fine-grained data (much of it user-generated and potentially personally identifiable)
  3. Your web server can communicate with any server in the world to send / receive data
  4. Your website can also embed scripts from third parties – enabling all sorts of additional data collection across a host of different companies.
  5. Most importantly, your system can shape how people think, act, communicate, etc.

19 of 45

Keep Learning!

This course is a survey of many different techniques and ideas involving web technologies. But there’s a ton more to learn!

Recommended next steps:

  1. Front-end developer roadmap (plus JavaScript roadmap)
  2. Back-end developer roadmap (plus Python roadmap)

20 of 45

Some things to watch out for going forward...

21 of 45

Some Thoughts for Your Consideration...

  1. Copyright & fair use
  2. Privacy / surveillance
  3. Recommendations / nudging
  4. Dark patterns
  5. Takeaways

22 of 45

1. Copyright & Fair Use

23 of 45

Copyright

  1. A person who authors a creative work owns its copyright, unless the work was created by an employee as part of their job or as a work for hire
  2. Assume that all works are protected by either copyright or trademark law unless conclusive information indicates otherwise.
  3. Just because work has been posted on the Internet or it lacks a copyright notice doesn’t mean that it’s in the public domain

24 of 45

Fair Use

Fair use (in US copyright law) is the doctrine that brief excerpts of copyrighted material may, under certain circumstances, be quoted verbatim for purposes such as criticism, news reporting, teaching, and research, without the need for permission from or payment to the copyright holder.

However, it is difficult to discern whether your use of someone else’s data / content actually counts as fair use. That decision is often left up to the courts.

25 of 45

Web Scraping

Web scraping refers to the extraction of data from a website. This information is collected and then exported into a format that is more useful for the user, such as a spreadsheet or an API. Sample use cases:

  1. Covid-19 vaccination finders
  2. Google’s web crawlers / spiders
  3. Price monitoring (real estate listings, stock prices, etc.)
  4. Social media / reputation monitoring

But web scraping isn’t necessarily legal by default, and it’s important to think through how you do it.

26 of 45

Fair Use & Web Scraping: Some Advice

  1. Use an API if one is provided, instead of scraping data.
  2. Respect the Terms of Service (ToS).
  3. Respect the rules of robots.txt.
  4. Use a reasonable crawl rate, i.e. don't bombard the site with requests. Respect the crawl-delay setting provided in robots.txt; if there's none, use a conservative crawl rate (e.g. 1 request per 10-15 seconds).
  5. Identify your web scraper or crawler with a legitimate user agent string. Create a page explaining what you're doing and why, and link back to it in your user agent string (e.g. 'MY-BOT (+https://yoursite.com/mybot.html)')
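Items 3-5 above can be sketched in Python using only the standard library. This is a minimal illustration, not a complete crawler: `BOT_NAME`, the info-page URL, and the 10-second fallback are placeholder assumptions taken from the examples on this slide, and `polite_fetch` is a hypothetical helper name.

```python
import time
import urllib.request
from urllib.robotparser import RobotFileParser

# Placeholder bot identity, following the pattern suggested in item 5.
BOT_NAME = "MY-BOT"
USER_AGENT = f"{BOT_NAME} (+https://yoursite.com/mybot.html)"

# Conservative fallback (item 4) when robots.txt has no crawl-delay setting.
DEFAULT_DELAY_SECONDS = 10


def build_parser(robots_txt):
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, url):
    """Return True if robots.txt allows our bot to request this URL."""
    return parser.can_fetch(BOT_NAME, url)


def delay_for(parser):
    """Use the site's crawl-delay if it has one, else the fallback."""
    delay = parser.crawl_delay(BOT_NAME)
    return DEFAULT_DELAY_SECONDS if delay is None else delay


def polite_fetch(parser, url):
    """Fetch one page only if allowed, then pause before returning."""
    if not may_fetch(parser, url):
        return None
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        body = response.read()
    time.sleep(delay_for(parser))
    return body
```

A check like this only covers robots.txt. The Terms of Service (item 2) still have to be read by a person.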

27 of 45

Fair Use & Web Scraping: Some Advice

  1. If the ToS or robots.txt prevent you from crawling or scraping, ask the owner for written permission.
  2. Don't republish your crawled or scraped data without verifying the license of the data, or without written permission from the copyright holder.
  3. If you doubt the legality of what you're doing, don't do it, or seek the advice of a lawyer.
  4. Don't base your whole business on data scraping. The website(s) that you scrape may eventually block you, just like what happened in Craigslist Inc. v. 3Taps Inc.

28 of 45

Case Study: Clearview AI & Facial Recognition

“The system — whose backbone is a database of more than three billion images that Clearview claims to have scraped from Facebook, YouTube, Venmo and millions of other websites — goes far beyond anything ever constructed by the United States government or Silicon Valley giants.

…allows users to potentially be able to identify every person they saw. The tool could identify activists at a protest or an attractive stranger on the subway, revealing not just their names but where they lived, what they did and whom they knew.”

29 of 45

Are OpenAI, Gemini, etc. Ruining the Internet?

  • Copyright infringement & theft
  • A deluge of bot-generated content

Why should we care about this?

30 of 45

2. Privacy / Surveillance

31 of 45

Why do we care about data privacy?

  1. Industry: Behavioral data used to manipulate user behavior in very problematic ways.
    1. We don’t know how this ecosystem works; we can’t opt out of it; and it has major social ramifications
    2. As a designer / developer, it is very easy to surveil users
  2. The State: If you get on the wrong side of power or belong to a marginalized community, your data footprint can easily be accessed
  3. Individual Rights: Most people value having some control over how and when information about them is shared.

32 of 45

Privacy by Design

Rather than considering privacy only just before a product launches, privacy by design considers privacy protections for individuals from the very beginning of the product / service development life cycle. Some key ideas:

  1. Privacy as the default setting (opt-in versus opt-out)
  2. Privacy embedded into design
  3. End-to-end security — full lifecycle protection
  4. Visibility and transparency (should be independently verifiable)
  5. User choice, control, and consent
  6. Data minimization (don’t store it unless you absolutely have to)
  7. Anonymizing personal data
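Items 6 and 7 can be illustrated with a short Python sketch. The field names, `ALLOWED_FIELDS`, and both helper functions are hypothetical. Note that the keyed hash shown here is pseudonymization rather than true anonymization: anyone holding the key can re-link the values, so the output should still be treated as personal data.

```python
import hashlib
import hmac

# Hypothetical allow-list: the only fields this feature actually needs.
ALLOWED_FIELDS = {"username", "email"}


def minimize(record, allowed=ALLOWED_FIELDS):
    """Drop every field that is not on the allow-list before storing."""
    return {key: value for key, value in record.items() if key in allowed}


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed SHA-256 hash (HMAC).

    The same value and key always give the same token, so records can
    still be joined, but the raw identifier is not stored.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Example: a signup form that collected more than the feature needs.
submitted = {
    "username": "walter",
    "email": "walter@example.com",
    "birthdate": "1990-01-01",
    "phone": "555-0100",
}
stored = minimize(submitted)
stored["email"] = pseudonymize(stored["email"], b"replace-with-a-real-secret")
```

In a real system the secret key would come from an environment variable rather than being written in the source code.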

33 of 45

What is GDPR?

The General Data Protection Regulation is a European Union privacy law that went into effect in 2018. It sets out six data processing principles, namely that data are:

  1. Processed fairly, lawfully and transparently
  2. Collected and processed for specific reasons and stored for specific periods of time, and not used for reasons beyond their original purpose
  3. Collected at a granularity needed for the intended purpose, and not more
  4. Accurate and that reasonable steps are taken to ensure that data remain accurate
  5. Kept in a form that allows individuals to be identified only as long as is necessary
  6. Kept securely and protected from unlawful access, accidental loss or damage

34 of 45

Data Rights within GDPR

1. The right to be informed

2. The right of access

3. The right to rectification

4. The right to erasure (i.e. “the right to be forgotten”)

5. The right to restrict processing

6. The right to data portability

7. The right to object

8. Rights in relation to automated decision making and profiling

35 of 45

3. Recommending and Nudging

36 of 45

Recommendation Systems: Benefits & Possibilities

  1. Allows people to navigate an increasingly voluminous information space (in many different domains)
  2. Doing this well has clear benefits for providers and users.
    1. Learning something you didn’t already know
    2. Matching buyers and sellers more effectively
    3. Saving time
    4. Managing information overload

37 of 45

Recommendation Systems: Harms & Risks

  1. Recommending can amplify viral / sensational content (e.g. fake news, conspiracy theories, etc.)
  2. Fairness issues (e.g. different benefits / harms to different user groups)
  3. Silos & echo chambers
  4. Persuasive systems (who has the power to persuade and to what ends?)
  5. Perverse incentives to gather more and finer-grained data about individual preferences and behaviors

38 of 45

Example: Practice Fusion Lawsuit

The San Francisco-based company, Practice Fusion, had developed a free software program that combed through data that doctors entered and suggested next steps for a treatment plan, known as clinical decision support (CDS).

Practice Fusion & Pharma Co. X signed a $1 million contract with the understanding that Practice Fusion would help increase sales by targeting people who had never been prescribed opioids before (analytics, big data).

39 of 45

Example: Practice Fusion Lawsuit

…the pain CDS allowed Pharma Co. X to, in essence, be present in the exam room while doctors interacted with patients.

The rigged alerts popped up on doctors’ computers more than 230 million times between July 2016 and the spring of 2019, when criminal charges were filed. The health-care providers who received them prescribed extended-release opioids at a higher rate than those who didn’t, prosecutors say.

40 of 45

Example: Practice Fusion Lawsuit

Christina E. Nolan, the U.S. attorney for the District of Vermont, said in a Monday statement that Practice Fusion “...allowed the drug company to have its thumb on the scale at precisely the moment a doctor was making incredibly intimate, personal, and important decisions about a patient’s medical care, including the need for pain medication and prescription amounts.”

She also noted that the company said advertising to doctors was its main source of revenue.

41 of 45

Example: Practice Fusion Lawsuit

Read more here...

  1. The Justice Department
  2. Washington Post

42 of 45

4. Dark Patterns

43 of 45

Dark Patterns

Dark Patterns are tricks used in websites and apps that make you do things that you didn't mean to, like buying or signing up for something.

  1. “Roach motels”
  2. Hidden Costs
  3. “Sneak into Basket”
  4. Disguised ads
  5. And more!

More info:

  1. https://www.darkpatterns.org/types-of-dark-pattern
  2. Campaign donation websites

44 of 45

Closing Thoughts

Distributed systems are powerful! As you’re building your systems, it’s up to you to think about the implications of your code / designs. Some questions you might ask yourself:

  1. Would I want to use this tool myself? Would I want my data being used like this?
  2. Is it possible that my designs / software might be harming others?
  3. What are the more nefarious uses of this technology? What are the risks?
  4. What steps might I take to maximize benefit and minimize harm?
  5. Who are the most vulnerable users in this system? Who stands to gain the most?

45 of 45

Thank you!

Have a wonderful winter break.