slides.org (11516B)
1 * 1. Concurrency 2 3 - David Wen Riccardi-Zhu 4 - Senior Software Engineer @ Good Uncle 5 - dwrz@dwrz.net 6 7 Slides and code: https://github.com/dwrz/talks/concurrency. 8 9 * 2. Why 10 11 You'll likely encounter it at work: 12 13 - Client requests to your backend 14 - Database and API requests made by your backend 15 - Frontend rendering 16 17 * 3. Why 18 19 Performance Benefits 20 21 - Typically outweigh gains from optimized data structures and algorithms. 22 - I/O versus CPU bound applications. 23 24 - Hardware limitations. 25 - Rate of single core performance improvement is slowing. 26 - More cores. 27 28 * 4. Why 29 30 Expressive 31 32 * 5. Why 33 34 Difficult to do correctly. 35 36 Easy to do incorrectly. 37 38 * 6. Goals 39 40 - Give you a rough sense of what concurrency is. 41 - Provide context to help you understand how it's made possible. 42 - Practical demonstration, to expose some of the trade-offs. 43 44 * 7. Three Parts 45 46 1. Fundamentals 47 2. Application 48 3. Infrastructure 49 50 * 8. Definition 51 52 /Concurrency/ 53 1. A running together in place or time. 54 2. Simultaneous occurrence. 55 56 OK starting point, but: 57 58 Time, and simultaneity, is /complicated/. 59 60 A second is a duration, not an instant. 61 62 Two things can happen in the same second and not be simultaneous. 63 64 With computers, this nuance matters. 65 66 * 9. Time 67 68 |----------------------+----------------+----------------| 69 | System Event | Actual Latency | Scaled Latency | 70 |----------------------+----------------+----------------| 71 | One CPU cycle | 0.4 ns | 1 s | 72 | Level 1 cache access | 0.9 ns | 2 s | 73 | Level 2 cache access | 2.8 ns | 7 s | 74 | Level 3 cache access | 28 ns | 1 min | 75 | Main memory access | ~100 ns | 4 min | 76 | NVMe SSD I/O | ~25 μs | 17 hrs | 77 | SSD I/O | 50–150 μs | 1.5–4 days | 78 | Rotational disk I/O | 1–10 ms | 1–9 months | 79 | Internet: SF to NYC | 65 ms | 5 years | 80 | Internet: SF to HK | 141 ms | 11 years | 81 |----------------------+----------------+----------------| 82 83 Source: [[https://www.prowesscorp.com/computer-latency-at-a-human-scale/]] 84 85 * 10. Example 86 87 - Get the current temperature of ten most populous cities in the US. 88 89 - First, /sequentially./ 90 - Then, /concurrently/. 91 92 [[file:cmd/temperatures/run.sh]] 93 94 * 11. Example 95 96 [[file:images/concurrent-requests.jpg]] 97 98 * 12. Cooking Recipes 99 100 - Step-by-step 101 - Concurrent: 102 - Handle other tasks while: 103 - Preheating oven 104 - Boiling water 105 - Waiting for timers 106 - Switching back to a task when it is ready for mork work. 107 108 * 13. Multitasking 109 110 Computers can "simultaneously" work on: 111 112 - Different /parts/ of a recipe 113 - Different /recipes/ 114 115 * 14. htop 116 117 * 15. How? 118 119 Combination of: 120 - Operating System *Scheduler* 121 - Hardware (/interrupts/) 122 - Keyboard 123 - Filesystem 124 - Network Card 125 - Timer 126 127 * 17. Cooperative Scheduling 128 [[file:images/cooperative-scheduling.jpg]] 129 130 * 18. Preemptive Scheduling 131 [[file:images/preemptive-scheduling.jpg]] 132 133 * 19. Preemptive Scheduling 134 [[file:images/preemptive-scheduling-tree.jpg]] 135 136 * 20. Multi Core 137 [[file:images/preemptive-scheduling-multi-core.jpg]] 138 139 * 21. Concurrency versus Parallelism 140 141 - Parallelism :: performing computations at the same time. 142 - Multiple cores. 143 - Multiples computers. 144 - Kind of like having multiple cooks in a kitchen. 145 - Can do different things at the same time. 146 - Can do one job faster (peeling a lot of potatoes). 147 148 - Concurrency :: performing computations over inter-leaving, non-sequential time periods. 149 - Single or multi-core. 150 151 * 22. Processes and Threads 152 A little more nuance. 153 154 - Different /recipes/ 155 - Different /parts/ of a recipe 156 157 - Process :: a program in execution. 158 - Unique resources (memory, code, files, etc.). 159 - Start off with one thread. 160 - Can /fork/. 161 - Communicate via inter-process communication (IPC). 162 163 - Thread :: sequence of instructions that can be managed independently by scheduler. 164 - Share resources ∴ less resource intensive. 165 - Can communicate via shared resources. 166 167 * 23. Processes and Threads 168 169 [[file:images/process-vs-thread.jpg]] 170 171 Source: https://stackoverflow.com/questions/16354460/forking-vs-threading/16354658 172 173 * 24. htop 174 175 Press ~H~ to show threads. 176 177 * 25. Languages and Runtimes 178 179 - ~C~, ~C++~, ~Rust~ allow for fine-grained control of processes and threads. 180 #+begin_src C :output raw 181 int main() { 182 fork(); 183 184 printf("Hello world!\n"); 185 186 return 0; 187 } 188 #+end_src 189 190 * 26. JavaScript 191 - Single-threaded (for application code). 192 - Runtimes implement concurrency via an event loop. 193 194 [[file:images/nodejs-event-loop.png]] 195 196 Source: [[https://medium.com/preezma/node-js-event-loop-architecture-go-deeper-node-core-c96b4cec7aa4]] 197 198 * 27. Aside: Event Loop 199 200 What is the output of the following? Why? 201 #+begin_src javascript 202 console.log('start'); 203 204 setTimeout(() => console.log('1'), 0); 205 206 console.log('middle'); 207 208 setTimeout(() => console.log('2'), 0); 209 210 console.log('end'); 211 #+end_src 212 213 * 28. Go 214 *Runtime scheduler* manages multiple threads. 215 [[file:images/go-scheduler.png]] 216 217 Source: https://freethreads.net/2019/01/23/go-runtime-scheduler-design-internals/ 218 219 * 29. Quick Review 220 221 - Sequential versus Concurrent Steps 222 - Compare with recipes. 223 - Concurrency versus Parallelism 224 - Parallelism: multiple cooks, simultaneously /doing/. 225 - Concurrency: one or more cooks, /dealing/ with many tasks. 226 - Operating System Scheduler 227 - Processes and Threads 228 - Languages and Runtimes 229 230 * 30. Part 2: Using Concurrency 231 232 Typical use cases: 233 - Network I/O between: 234 - Clients and servers. 235 - Services and database(s). 236 - Microservices. 237 - Service and external APIs. 238 239 - User interfaces 240 - Capturing user input 241 - Rendering output 242 243 - Similar, but independent computations. 244 245 - Solving problems that are best expressed with concurrency. 246 247 * 31. Go Examples 248 249 1. Shared Memory with Mutex (Mutual Exclusion) 250 - Pitfall: synchronization, debugging. 251 252 2. Message Passing (Channels and Goroutines) 253 - Pitfall: resources usage, complexity. 254 255 * 32. Mutex Example 256 Concurrent updates to a bank account balance. 257 258 Code: 259 [[file:cmd/mutex/main.go]] 260 261 Run: [[file:cmd/mutex/run.sh]] 262 - Without mutex. 263 264 * 33. Data Races 265 [[file:images/data-race.jpg]] 266 267 * 34. Mutex 268 269 /Mutual Exclusion/ 270 271 - Purpose: protect a critical section of code. 272 - Effect: serialization. 273 274 Run: [[file:cmd/mutex/run.sh]] 275 - With mutex. 276 277 * 35. Race Detector 278 279 How can we be sure? 280 281 Race detector can give us /some/ confidence (but no guarantees). 282 283 - Without mutex. 284 - With mutex. 285 286 * 36. Tradeoffs: Validating Correctness, Testing, Debugging 287 288 - Non-deterministic behavior 289 - Works fine on one run, fatal bug in another. 290 - Up to how the scheduler(s) order things at execution time. 291 - Concealed bugs 292 293 * 37. Message Passing 294 295 Proverbs: 296 #+begin_quote 297 Don't communicate by sharing memory, share memory by communicating. 298 299 Channels orchestrate, mutexes serialize. 300 #+end_quote 301 302 * 38. Program Requirements 303 304 - Read CSV of zip codes 305 - 50 in our example 306 - Full list is ~42,000 307 308 - Get temperature with OpenWeatherMap API 309 - 60 requests per minute 310 - Need to rate limit our API requests. 311 312 - Display running list of top ten temperatures. 313 314 * 39. Concurrency Hallmarks 315 316 - Repeatedly processing similar data (lines of CSV) 317 - Network I/O 318 - UI I/O 319 320 * 40. Message Passing Example 321 322 Run: [[file:cmd/csp/run.sh]] 323 324 * 41. Code Walkthrough 325 326 [[file:cmd/csp/main.go]] 327 328 * 42. Channels and Goroutines 329 [[file:images/message-passing-example-diagram.jpg]] 330 331 * 43. Channels and Goroutines 332 [[file:images/go-scheduler.jpg]] 333 334 * 44. Scaling 335 336 What if: 337 - We wanted to process the entire list of 42,000 zip codes? 338 - We had access to the Enterprise tier of the Open Weather Map API (200,000 requests per minute). 339 340 Could we just get rid of the limiter? 341 342 * 45. Resources 343 344 Probably not, because: 345 346 - OS limit on open files (~ulimit~) 347 - Ask me how I know... 348 - Server load 349 - Network load 350 - CPU 351 - Ranking goroutine becomes a chokepoint. 352 - Better data structures and algorithms could help here. 353 - Goroutines are cheap, but not free. 354 355 * 46. Possible Solutions 356 357 - Use a semaphore to limit concurrency. 358 - Buffers 359 360 * 47. Designing and Managing Concurrency 361 362 - Size of input matters. 363 - Resources 364 - Pipelines 365 - Chokepoints 366 - Backpressure 367 - Complexity 368 369 * 48. Quick Review 370 371 Techniques for implementing and managing concurrency: 372 - Event loop 373 - Mutexes 374 - Communication (channels and goroutines) 375 376 Concurrency Pitfalls: 377 - Synchronization 378 - Resource usage 379 - Debugging 380 - Complexity 381 - Need to think differently about concurrent code. 382 383 * 49. Part 3: Infrastructure Concurrency 384 385 Concurrent applications running concurrently on distributed computers. 386 387 - Client Application 388 - Backend Application 389 - Databases 390 - Host and Network Infrastructure 391 392 * 50. Example 393 [[file:images/infrastructure-data-race.jpg]] 394 395 * 51. Review 396 397 - What concurrency is 398 - Sequential versus concurrent 399 - Comparison with recipes and cooks 400 - Compared to parallelism 401 - How it is implemented 402 - Operating System Scheduler and Interrupts 403 - Languages and Runtimes 404 405 * 52. Review 406 407 - When, why, and how to use it 408 - I/O + UI 409 - Similar, independent computations 410 - Expressiveness 411 412 - Pitfalls 413 - Synchronization 414 - Resources Usage 415 - Debugging 416 - Complexity 417 418 * 53. Review 419 420 - Infrastructure 421 - Data Race 422 423 * 54. Conclusion 424 425 - Goals: 426 - Give you a rough sense of what concurrency is. 427 - Provide context to help you understand what's going on. 428 - Practical exposure, to expose some of the trade-offs. 429 430 - Awareness and Intuition 431 432 - Remember: 433 - /Premature optimization is the root of all evil/. 434 - Tradeoffs and economics (resources, costs, developer time). 435 - Learning is continuous... 436 437 * 55. Further Reading 438 439 - [[https://slikts.github.io/concurrency-glossary/][Concurrency Glossary]] 440 - [[https://blog.golang.org/waza-talk][Concurrency is not parallelism]] 441 - [[https://en.wikipedia.org/wiki/Executable_and_Linkable_Format][Executable and Linkable Format]] 442 - [[https://www.youtube.com/watch?v=KBZlN0izeiY][GopherCon 2017: Kavya Joshi - Understanding Channels]] 443 - [[http://web.mit.edu/6.005/www/fa15/][MIT 6.005: Software Construction]] 444 - Chapters 19, 20, 22, 23 445 - [[https://www.gopl.io/][The Go Programming Language]] 446 - [[https://en.wikipedia.org/wiki/Concurrency_(computer_science)][Wikipedia: Concurrency]] 447 - [[https://en.wikipedia.org/wiki/Concurrency_control][Wikipedia: Concurrency Control]] 448 - [[https://en.wikipedia.org/wiki/Concurrent_computing][Wikipedia: Concurrent Computing]] 449 - [[https://en.wikipedia.org/wiki/Deadlock][Wikipedia: Deadlock]] 450 - [[https://en.wikipedia.org/wiki/Dining_philosophers_problem][Wikipedia: Dining Philosophers Problem]] 451 - [[https://www.backblaze.com/blog/whats-the-diff-programs-processes-and-threads/][What’s the Diff: Programs, Processes, and Threads]] 452 - [[https://www.youtube.com/watch?v=8aGhZQkoFbQ&t=13s][What the heck is the event loop anyway? | Philip Roberts]] 453 - [[https://en.wikipedia.org/wiki/Heisenbug][Wikipedia: Heisenbug]] 454 - [[https://en.wikipedia.org/wiki/Interrupt][Wikipedia: Interrupt]] 455 - [[https://en.wikipedia.org/wiki/Libuv][Wikipedia: Libuv]] 456 - [[https://en.wikipedia.org/wiki/Process_(computing)][Wikipedia: Process]] 457 - [[https://en.wikipedia.org/wiki/Scheduling_(computing)][Wikipedia: Scheduling]] 458 - [[https://en.wikipedia.org/wiki/Therac-25][Wikipedia: Therac-25]] 459 - [[https://en.wikipedia.org/wiki/Thread_(computing)][Wikipedia: Thread]] 460 - [[https://github.com/danicat/pacgo][pacgo: A Pac Man clone written in Go]] 461 - [[https://medium.com/preezma/node-js-event-loop-architecture-go-deeper-node-core-c96b4cec7aa4][node.js event loop architecture]]