Windows has many graphics APIs. Some are layers built on top of others (e.g., GDI+ on top of GDI), and others are completely independent stacks (like the Direct3D family).
In an API like GDI, there are functions like SetPixel which let you change the value of a single pixel on the screen (or within a region of the screen that you have access to). But using SetPixel to setting lots of pixels is generally slow.
If you were to build a photorealistic renderer, like a ray tracer, then you'd probably build up a bitmap in memory (pixel by pixel), and use an API like BitBlt that sends the entire bitmap to the screen at once. This is much faster.
But it still may not be fast enough for rendering something like video. Moving all that data from system memory to the video card memory takes time. For video, it's common to use a graphics stack that's closer to the low-level graphics drivers and hardware. If the graphics card can do the video decompression directly, then sending the compressed video stream to the card will be much more efficient than sending the decompressed data from system memory to the video card--and that's often the limiting factor.
But conceptually, it's the same thing: you're manipulating a bitmap (or texture or surface or raster or ...), but that bitmap lives in graphics memory, and you're issuing commands to the GPU to set the pixels the way you want, and then to display that bitmap at some portion of the screen (often with some sort of transformation).
Modern graphics processors actually run little programs--called shaders--that can (among other things) do calculations to determine the pixel values. The GPUs are optimized to do these types of calculations and can do many of them in parallel. But ultimately, it boils down to getting the pixel values into some sort of bitmap in video memory.