Memory timings

From Wikipedia, the free encyclopedia

Memory timings or RAM timings describe the timing information of a memory module or of onboard LPDDRx memory. Due to the inherent properties of VLSI and microelectronics, memory chips require time to fully execute commands. Issuing commands too quickly results in data corruption and system instability. With appropriate time between commands, memory modules and chips have the opportunity to fully switch transistors, charge capacitors, and correctly signal information back to the memory controller. Because system performance depends on how fast memory can be used, these timings directly affect the performance of the system.

The timing of modern synchronous dynamic random-access memory (SDRAM) is commonly indicated using four parameters, CL, TRCD, TRP, and TRAS, in units of clock cycles; they are commonly written as four numbers separated by hyphens, e.g. 7-8-8-24. The fourth (TRAS) is often omitted, and a fifth, the command rate, is sometimes added (normally 2T or 1T, also written 2N, 1N or CR2). These parameters (as part of a larger whole) specify the latency, in clock cycles, of certain commands issued to a random-access memory. Lower numbers imply a shorter wait between commands (as measured in clock cycles). Intel systems also have Gear 2 (Gear type 0) and Gear 4 (Gear type 1) modes.
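The hyphen-separated notation can be unpacked mechanically. The following is a minimal sketch, not a standard parser: the `Timings` type and the `parse_timings` function are hypothetical names introduced here, and the assumption that a command rate, when present, appears as a fifth hyphen-separated field (e.g. 7-8-8-24-2T) is made for illustration only.

```python
from typing import NamedTuple, Optional


class Timings(NamedTuple):
    """Primary timings in clock cycles (hypothetical container type)."""
    cl: int                             # CAS latency
    trcd: int                           # row address to column address delay
    trp: int                            # row precharge time
    tras: Optional[int] = None          # row active time, often omitted
    command_rate: Optional[str] = None  # e.g. "2T" or "1T", sometimes added


def parse_timings(text: str) -> Timings:
    """Parse a string such as "7-8-8-24" into named fields.

    Assumption: an optional command rate is written as a fifth
    hyphen-separated field, e.g. "7-8-8-24-2T".
    """
    parts = text.strip().split("-")
    if not 3 <= len(parts) <= 5:
        raise ValueError(f"expected 3 to 5 fields, got {len(parts)}: {text!r}")
    cl, trcd, trp = (int(p) for p in parts[:3])
    tras = int(parts[3]) if len(parts) >= 4 else None
    command_rate = parts[4].upper() if len(parts) == 5 else None
    return Timings(cl, trcd, trp, tras, command_rate)


print(parse_timings("7-8-8-24"))
# Timings(cl=7, trcd=8, trp=8, tras=24, command_rate=None)
```

A three-field string such as "7-8-8" leaves `tras` as `None`, reflecting that the fourth value is often omitted.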

Name | Symbol | Definition
CAS latency | CL | The number of cycles between sending a column address to the memory and the beginning of the data in response. This is the number of cycles it takes to read the first bit of memory from a DRAM with the correct row already open. Unlike the other numbers, this is not a minimum, but an exact number that must be agreed on between the memory controller and the memory.
Row Address to Column Address Delay | TRCD | The minimum number of clock cycles required between opening a row of memory and accessing columns within it. The time to read the first bit of memory from a DRAM without an active row is TRCD + CL.
Row Precharge Time | TRP | The minimum number of clock cycles required between issuing the precharge command and opening the next row. The time to read the first bit of memory from a DRAM with the wrong row open is TRP + TRCD + CL.
Row Active Time | TRAS | The minimum number of clock cycles required between a row active command and issuing the precharge command. This is the time needed to internally refresh the row, and it overlaps with TRCD. In SDRAM modules, it is simply TRCD + CL. Otherwise, it is approximately equal to TRCD + 2×CL.
Notes:
  • RAS: Row Address Strobe, a terminology holdover from asynchronous DRAM.
  • CAS: Column Address Strobe, a terminology holdover from asynchronous DRAM.
  • TWR: Write Recovery Time, the time that must elapse between the last write command to a row and precharging it. Generally, TRAS = TRCD + TWR.
  • TRC: Row Cycle Time. TRC = TRAS + TRP.
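The three read cases in the table and the row cycle time relation can be written out as a short sketch. The function names and the row-state labels ("open", "idle", "wrong") are hypothetical and chosen only for this illustration; the arithmetic follows the definitions given above and uses the 7-8-8-24 example.

```python
def first_bit_latency_cycles(cl: int, trcd: int, trp: int, row_state: str) -> int:
    """Clock cycles until the first bit of a read arrives.

    row_state is a hypothetical label for the state of the bank:
      "open"  - the correct row is already open:  CL
      "idle"  - no row is active:                 TRCD + CL
      "wrong" - a different row is open:          TRP + TRCD + CL
    """
    if row_state == "open":
        return cl
    if row_state == "idle":
        return trcd + cl
    if row_state == "wrong":
        return trp + trcd + cl
    raise ValueError(f"unknown row state: {row_state!r}")


def row_cycle_time(tras: int, trp: int) -> int:
    """Row cycle time: TRC = TRAS + TRP."""
    return tras + trp


cl, trcd, trp, tras = 7, 8, 8, 24  # the 7-8-8-24 example
for state in ("open", "idle", "wrong"):
    print(state, first_bit_latency_cycles(cl, trcd, trp, state))
print("TRC", row_cycle_time(tras, trp))
# open 7
# idle 15
# wrong 23
# TRC 32
```

All results are in clock cycles; converting them to time requires the clock frequency.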

Absolute latency (and thus system performance) is determined by both the timings and the memory clock frequency. When translating memory timings into actual latency, it is important to note that timings are in units of clock cycles, and that for double data rate memory the clock frequency is half the commonly quoted transfer rate. Without knowing the clock frequency, it is impossible to state whether one set of timings is "faster" than another.

For example, DDR3-2000 memory has a 1000 MHz clock frequency, which yields a 1 ns clock cycle. With this 1 ns clock, a CAS latency of 7 gives an absolute CAS latency of 7 ns. Faster DDR3-2666 memory (with a nominal 1333 MHz clock, a rounded figure; the cycle time is exactly 0.75 ns) may have a larger CAS latency of 9, but at that clock frequency waiting 9 clock cycles takes only 6.75 ns. For this reason, DDR3-2666 CL9 has a smaller absolute CAS latency than DDR3-2000 CL7 memory.
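The conversion in this example can be sketched in a few lines. The function names are hypothetical; the only inputs are the quoted transfer rate in MT/s and a timing in clock cycles. Because "2666" and "1333" are rounded figures, the sketch passes the exact rate 8000/3 MT/s for DDR3-2666, which is an assumption made to reproduce the 0.75 ns cycle time.

```python
def cycle_time_ns(transfer_rate_mt_s: float) -> float:
    """Clock period in ns for double data rate memory.

    The clock frequency in MHz is half the transfer rate in MT/s,
    and a clock of f MHz has a period of 1000 / f ns.
    """
    clock_mhz = transfer_rate_mt_s / 2
    return 1000.0 / clock_mhz


def absolute_latency_ns(transfer_rate_mt_s: float, cycles: int) -> float:
    """Convert a timing given in clock cycles into nanoseconds."""
    return cycles * cycle_time_ns(transfer_rate_mt_s)


# DDR3-2000 CL7: 1000 MHz clock, 1 ns cycle
print(f"DDR3-2000 CL7: {absolute_latency_ns(2000, 7):.2f} ns")
# DDR3-2666 CL9: exact rate 8000/3 MT/s, 0.75 ns cycle
print(f"DDR3-2666 CL9: {absolute_latency_ns(8000 / 3, 9):.2f} ns")
# DDR3-2000 CL7: 7.00 ns
# DDR3-2666 CL9: 6.75 ns
```

The same conversion applies to any timing expressed in clock cycles, not only CL.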

For both DDR3 and DDR4, the four timings described earlier are not the only relevant timings, and they give only a brief overview of memory performance. The full timings of a memory module are stored in the module's SPD chip. On DDR3 and DDR4 DIMMs, this chip is a PROM or EEPROM and contains the JEDEC-standardized timing table data format. See the SPD article for the table layout among different versions of DDR and examples of other memory timing information that is present on these chips.

Modern DIMMs include a Serial Presence Detect (SPD) ROM chip that contains recommended memory timings for automatic configuration, as well as XMP/EXPO profiles with faster timings (and higher voltages) that allow a performance boost via overclocking. The BIOS on a PC may allow the user to adjust timings manually in an effort to increase performance (at the possible risk of decreased stability) or, in some cases, to increase stability (by using suggested timings).

DDR5 introduced support for FGR (fine granular refresh), with its own tRFC2 and tRFC4 timings.[1]

Note: Memory bandwidth measures the throughput of memory, and is generally limited by the transfer rate, not latency. By interleaving access to SDRAM's multiple internal banks, it is possible to transfer data continuously at the peak transfer rate. Increased bandwidth can come at a cost in latency. In particular, each successive generation of DDR memory has higher transfer rates, but the absolute latency does not change significantly; especially when first appearing on the market, a new generation generally has longer latency than the previous one. The CPU architecture, and bugs in the CPU, can also affect the latency.
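As a rough illustration of why peak bandwidth follows the transfer rate rather than the timings, it can be estimated as the transfer rate multiplied by the width of the data bus. This is a sketch under stated assumptions: the function name is hypothetical, the 64-bit data bus is an assumed per-channel width that is not given in the text, and the exact rate 8000/3 MT/s is used for DDR3-2666.

```python
def peak_bandwidth_mb_s(transfer_rate_mt_s: float, bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in MB/s for one channel.

    Assumption: a 64-bit data bus by default. No timing parameter
    (CL, TRCD, TRP, TRAS) appears in this formula.
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * bytes_per_transfer


print(f"DDR3-2000: {peak_bandwidth_mb_s(2000):.0f} MB/s")
print(f"DDR3-2666: {peak_bandwidth_mb_s(8000 / 3):.0f} MB/s")
# DDR3-2000: 16000 MB/s
# DDR3-2666: 21333 MB/s
```

These are theoretical upper bounds; sustained throughput depends on how well accesses can be interleaved across banks, as described above.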

Increasing memory bandwidth, even while increasing memory latency, may improve the performance of a computer system with multiple processors and/or multiple execution threads. Higher bandwidth will also boost the performance of integrated graphics processors that have no dedicated video memory but use regular RAM as VRAM. Modern x86 processors are heavily optimized with techniques such as superscalar instruction pipelines, out-of-order execution, memory prefetching, memory dependence prediction, and branch prediction to preemptively load data from RAM (and from caches) and speed up execution even further. With this amount of complexity from performance optimization, it is difficult to state with certainty what effects memory timings may have on performance. Different workloads have different memory access patterns, and their performance is affected differently by these memory timings.

Handling in BIOS

In Intel systems, memory timings and management are handled by the Memory Reference Code (MRC), a part of the BIOS.[2][3] Much of it is also managed by Intel MEI, a MINIX-based operating system that runs on a dedicated core in the PCH. Some of its sub-firmware components can affect memory latency.

References

  1. ^ Stuecheli, Jeffrey (June 2013). "Understanding and Mitigating Refresh Overheads in High-Density DDR4 DRAM Systems" (PDF). doi:10.1145/2485922.248592. Retrieved 2025-01-06.
  2. ^ Watson, Alex (2007-11-27). "The life and times of the modern motherboard" (possibly reposted from original content on custompc.com). p. 8. Archived from the original on 22 July 2012. Retrieved 23 December 2016.
  3. ^ Pelner, Jenny; Pelner, James. "Minimal Intel Architecture Boot Loader (323246)" (PDF). Intel. Retrieved 12 November 2022.