Everything posted by MyPC8MyBrain
-
For additional context, this quote did not come out of a single email or a casual exchange. This was after two full days of back and forth where I refused to accept a 100 percent overnight RAM increase on an active enterprise order. I escalated the issue beyond account management to regional and global executives and made it clear I was not going to pay what amounted to ransomware pricing. After that escalation, I pulled the entire invoice off the table. Several hundred thousand dollars in equipment was withdrawn from the deal. That's when I received the email response I'm quoting below from Dell's regional sales manager.

At that point the position was clear. This was not a negotiation and not a supply issue. It was a decision. Dell chose to hold pricing rather than keep the business, and as a result they lost the entire invoice to a competing vendor.

That is why the explanation does not hold up. Dell did not suddenly incur double costs overnight. There was no emergency restock at inflated pricing. This was about margin protection and inventory allocation, not cost recovery. When vendors are willing to walk away from large enterprise orders rather than adjust pricing, it tells you where the incentives really are. Traditional customers are no longer the priority when the same hardware can be redirected into higher-margin channels tied to AI demand. That is not a RAM shortage. That is price steering. And this is exactly why buyers need to push back. Once customers accept this as normal, it stops being a market distortion and becomes policy.
-
The AI hardware market isn't about raw compute anymore. What's really being exploited is a memory problem that's been baked into x86 PCs and servers for decades and never fixed. Here's the core of it. CPU and GPU memory don't talk to each other properly. They sit in completely separate pools, and any time the GPU needs data from system RAM, it has to copy it over the PCIe bus, sync it, process it, then copy it back. This isn't really a bandwidth problem. It's latency, coordination, and a mess of software complexity, and it gets worse as AI models get bigger.

That's why VRAM has become a hard gate for anyone trying to run AI locally. If your model doesn't fit inside the GPU's memory, you are basically dead in the water, even while massive chunks of system RAM sit idle.

Apple Silicon shows this isn't a law of physics. With unified, coherent memory, CPU and GPU work on the same data without copies. Memory isn't owned by one device. It is shared. Software gets simpler, latency drops, and efficiency jumps. The only limit is how much memory the product ships with, not the architecture itself.

This is exactly where the industry should be heading, and it is where CXL comes in. CXL isn't a faster PCIe bus. It is a coherency protocol that lets CPUs, GPUs, and memory expanders share the same memory pool. Instead of forcing GPUs to hoard memory locally, you can treat system RAM as a shared resource. Models don't need to be duplicated per GPU anymore, and scaling becomes a matter of adding compute, not copying memory. It doesn't magically make DDR as fast as HBM, and latency doesn't disappear. But it removes the need to constantly move data around just to make accelerators work at all.

This is also why CUDA exists in the form it does. CUDA thrives because it gives developers control over isolated memory domains, which is exactly what current hardware forces you to do. CUDA didn't create the problem. It just optimized around it. But the moment memory becomes coherent and shared, a lot of CUDA's "must-have" control starts to matter less. You move from orchestrating memory to scheduling compute, and suddenly the advantage of isolated memory shrinks.

NVIDIA knows this, and their strategy locks the status quo in place. They cannot offer unified system memory on x86 today, so instead they pack GPUs with ever-larger VRAM. That is not luxury. It is a workaround. NVLink exists for multi-GPU coherence, but it is mostly limited to servers. High-end workstations ship massive VRAM but no real way to pool it coherently. You are forced to manage memory manually or pay an enormous premium for server-grade gear. It is not a mistake. It is a product boundary.

The result is clear. The AI boom is built on duplicated memory, forced copies, and inflated costs. NVIDIA isn't just winning because of fast GPUs or CUDA. They have built an ecosystem around an architectural flaw. CXL is the first step to fixing it. It will not flip the market overnight, but coherent memory as a first-class system resource will eventually shift the balance. Control over capacity matters less, efficiency of compute matters more. Right now, the industry is paying a premium for an architectural flaw that NVIDIA has learned to monetize with surgical precision. That is the real NVIDIA tax.
-
The mess we're seeing in RAM pricing didn't come out of nowhere. It's the end result of a chain reaction that started with two companies most people have never heard of. The rabbit hole begins in Spruce Pine, North Carolina, where Sibelco and The Quartz Corp. run the mines that produce almost all of the ultra-pure quartz used to make the crucibles for growing silicon ingots. This material is so clean and rare that the entire semiconductor supply chain depends on it. No quartz means no crucibles. No crucibles means no wafers. No wafers means no chips. It is one of the most fragile choke points in modern technology, and nobody paid attention to it until Hurricane Helene hit in September 2024.

Power went out, roads were flooded, and both mines were shut down for weeks. That alone was enough to push every major wafer supplier into allocation almost immediately. Shin-Etsu, SUMCO, GlobalWafers, Siltronic; all of them tightened supply heading into 2025 because their raw material pipeline had stalled.

Once the upstream pressure hit the wafer producers, the downstream consequences landed in the laps of Samsung, SK Hynix, and Micron. These three control roughly ninety-five percent of the world's DRAM output and effectively the entire supply of HBM. Their order books were already stressed, and the gap between normal PC demand and AI demand turned into a canyon. Consumer DDR5 grows at a predictable pace. AI customers are throwing money at HBM3 and HBM4 and are willing to pay five to ten times the margin of desktop memory. Faced with that imbalance, the big three did what they always do when margins are skewed. They shifted most of their advanced DRAM capacity toward HBM and server-grade DDR5. The consumer market was left with scraps.

Prices didn't rise by accident; the shortage was engineered by capacity decisions. December contract pricing jumped nearly 80 to 100 percent in a single month. Retail followed just as fast. Kits that cost around one hundred dollars in midsummer now sit closer to two hundred fifty and climbing. Even DDR4 is riding the same wave as older lines get repurposed or shut down. The ripple effect is already hitting system builders. Prebuilt vendors have announced fifteen to twenty-five percent price increases and directly named memory costs as the reason.

And this isn't the ceiling. Current projections show constrained supply stretching well into late 2027. New fabs take years to build and qualify. Meanwhile AI demand refuses to slow. Boiled down, the Spruce Pine shutdown was the trigger, but the runaway market we're seeing now is the result of Samsung, SK Hynix, and Micron chasing AI margins and letting the consumer channel absorb the damage. It mirrors the seventies oil shock, except the "OPEC" here is three semiconductor giants who don't need to hide their strategy. RAM has become digital oil, and the price at the pump just doubled. If you plan to upgrade, do it now, because this trend isn't turning around anytime soon.
-
That's exactly the direction this is heading, and it's not even incompetence; it's intentional. Everyone thinks "8GB is too little," but for the next wave of OS designs, 8GB is plenty if the UI is nothing more than a thin shell that boots straight into your online account. Local compute becomes irrelevant when everything you do gets pushed through a remote service. First it was games. Then it was telemetry. Then "cloud integration." Now the hardware footprint itself is being shrunk to force people deeper into online ecosystems where every click, scroll, and purchase is monetized. You're not buying a machine anymore; you're buying an access terminal that feeds them data.

And that's the endgame: reduce BOM costs, cut local capability, and make sure every user, especially the non-technical ones, has no choice but to live inside a walled garden. It's the most reliable revenue model they've ever had, and they're going to push it hard. The cheap laptops won't just be weak. They'll be weak on purpose.

It's insane when you step back and look at it. We spent decades moving forward, empowering people with real machines, real autonomy, real capability, and now the entire industry is dragging us backwards so the richest companies on earth can squeeze profit out of every possible angle. People forgot what "PC" even stands for. It was Personal Computing. Local. Independent. Yours. What they're pushing now is basically Public Computing: thin clients dressed up as laptops, everything routed through someone else's servers, someone else's rules, someone else's monetization engine. It's the opposite of progress. It's not innovation; it's consolidation disguised as convenience. And the saddest part? Most people won't realize what they lost until it's completely gone and irreversible.
-
Quick update on where things stand, and honestly, the market's taken another turn for the worse in just the past few days. It's not just RAM anymore. Everything upstream is getting hammered. Even copper futures spiked hard, which tells you all you need to know about where the manufacturing and logistics chain is heading. When raw materials start jumping like that, everything that depends on them follows: PCBs, power delivery, cabling, server chassis, networking hardware, all of it.

The pricing pressure we're seeing from Dell isn't isolated. It's a symptom of a bigger storm that's been building quietly for months. AI build-outs have vacuumed up supply, fabs are oversubscribed, and now even the industrial commodities behind the hardware are taking hits. That's a perfect recipe for a market where consumers, prosumers, small businesses, labs, repair shops, everyone gets squeezed hard. And the worst part is the feedback loop: higher costs mean fewer purchases, fewer purchases mean lower volume, lower volume means even higher per-unit costs. This snowball is rolling downhill fast, and unless something breaks the cycle, we're all about to pay "entry fees" for basic compute that would've sounded insane five years ago. The trend is obvious: we're heading into a hardware market where everything costs more, performs the same, and arrives slower. The timing couldn't be worse.
-
There's so much wrong here it's hard to know where to start. By the time these companies finish polishing their new AI data centers and building their next wave of fabs, they're going to run headfirst into a market where fewer and fewer people can afford the hardware needed to actually use any of it. This isn't just about RAM. Every step in the chain is flashing red:
• Rack space for small and mid-size businesses is already tightening, and once data centers give priority to AI customers with deeper pockets, prices will spike. A ton of small operations, hobby projects, and long-standing community forums simply won't survive the new baseline.
• Cloud dependence will get worse, not better. As physical hardware becomes too expensive for individuals to buy or host, more people will be forced into cloud ecosystems that are also raising prices.
• The talent shortage is real. There aren't enough qualified engineers to justify the billions being poured into expansion. You can't brute-force experience. You can't scale people like you scale GPUs.
• Consumer hardware is drifting toward luxury-tier pricing. It's becoming something you "qualify for," not something you buy. Entry-level devices will become disposable junk, while anything usable gets locked behind a financial wall.
Everyone's chasing the same fantasy: infinite AI growth with no real-world constraints. It's a blind stampede that destroys the equilibrium we finally reached with component pricing, not because of innovation, but because greed and panic are steering the ship. Honestly, it's hard to call this progress. We're supposedly the "smart species," yet we're building the lion's den and then sprinting straight toward its mouth. The monkey we claim to have evolved from would've simply run for its life. So right now, who looks smarter?
-
Quick update on where Dell's head is at right now. They've fully committed to the narrative that "the market is going up next month" due to AI demand, component shortages, the usual buzzwords. They're trying to set the stage so everyone just accepts higher pricing as inevitable. The problem is the numbers don't line up with reality. HP's public pricing on identical-class hardware is still sitting well below Dell's, even without promos. That tells you everything. On Dell's side, the internal machinery feels jammed. Slow movement, slow approvals, half-baked answers, lots of "checking with the team." It's not the reps; they're doing what they can, but the system above them is clearly locked into a defensive position. The whole thing feels more like price conditioning than actual market pressure. Bottom line: Dell is betting hard that customers won't push back and won't price-check outside their ecosystem. The public market says otherwise. If this is the direction they're taking, expect the next quarter to be rough for anyone sticking with them out of habit instead of value.
-
You're mixing up three different things:
1. Edition stability (Enterprise vs Pro)
2. Servicing model (LTSC vs yearly/H2 releases)
3. Update channels (GA vs Insider builds)
These behave differently and shouldn't be compared as if they were the same. Enterprise > Pro because of stability, update discipline, and no consumer experiments. LTSC > Enterprise because it freezes features and gives you the most stable and predictable platform Microsoft makes. You are not missing essential performance fixes on LTSC. The only things you "miss" are UI changes and consumer features. If you want maximum consistency, low background churn, and a workstation OS that never changes under you, LTSC is the correct branch.
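If you want to double-check which branch a given machine is actually on, a quick registry read is enough. A minimal PowerShell sketch, reading the standard CurrentVersion values (LTSC reports an EditionID of EnterpriseS):

# Read edition and servicing info from the standard registry location
Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion' |
    Select-Object ProductName, EditionID, DisplayVersion, CurrentBuild, UBR

EditionID tells you the edition (Professional, Enterprise, EnterpriseS for LTSC), and DisplayVersion/CurrentBuild tell you which servicing release you're sitting on.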
-
Here's a direct comparison against an equivalent HPE configuration to sanity-check this RAM increase. Both servers are effectively identical where it matters:
• Xeon Gold 6526Y
• 128 GB DDR5-5600
• 2× 480 GB SATA SSD (RI)
• Hardware RAID
• 8× 2.5" SAS/SATA-capable bays
• Redundant Titanium PSUs
• The only functional difference is networking (HPE ships 4× 1 GbE vs Dell's 2× 1 GbE)
Publicly advertised pricing:
• Dell PowerEdge R660xs: $8,939.25 (link)
• HPE ProLiant DL360 Gen11: $6,120.28 (link; add 2× 480 GB SSDs to the HPE config to match)
Dell's system lands about 46% higher ($8,939.25 / $6,120.28 ≈ 1.46) for the same hardware profile.
-
Quick update on the situation. After pushing back on the sudden RAM price hike, Dell doubled down and framed it as "new pricing models" driven by market conditions. I went through several rounds of back-and-forth with their sales and finance contacts. The justification kept shifting, and it became clear the change wasn't tied to actual upstream cost increases; it was a selective adjustment on server-grade RAM only. The ironic part is we have multiple servers and a large batch of workstations on order. Not a single workstation configuration saw a RAM increase. Only the servers got hit, and only after they realized the size of the overall purchase. That tells you everything. Right now, it looks less like a supply problem and more like internal margin tuning triggered by the current AI-driven panic in the market.
-
We've had our share of global "races" before - atomic, space, take your pick. The difference is those had a finish line. AI doesn't. This one just keeps accelerating with no defined endpoint, and the collateral damage gets pushed onto everyone else. No one actually seems to know what the end goal is here. There's no clear technological milestone that advances humanity; just financial greed and FOMO. These companies aren't betting on a defined outcome; they're betting on "not missing out," and all of it revolves around endlessly training bigger models. It's a blind arms race that burns resources and drives prices up for everyone else. The newer crowd pushed the old guard aside, and the whole thing is driven by reactionary hype rather than discipline, strategy, or long-term vision, and the industry is paying the price for it.
-
Yeah, I saw that headline too. This AI frenzy is out of control. By the time the average non-tech person realizes what’s happening, the damage will already be baked in. Prices never snap back 50% in a few months; once they ratchet up, they stay there. If this keeps going unchecked, it’s going to wreck the entire market. Everyone pays for the hype except the companies driving it.
-
The funny part is we've got multiple servers on order and ten times that number in Dell workstations coming in. Not a single workstation config had a RAM price hike. Only the servers did. So the "global supply increase" explanation doesn't hold water. If this were a universal memory cost jump, it would hit every line across the board. Instead, they selectively bumped server RAM, where margins are already padded and customers are easier to corner. That tells me this isn't supply pressure; it's opportunistic pricing.
-
That isn't "market adjustment." It's straight exploitation. Vendors are offloading their AI-driven supply issues onto end users. They're the ones pouring billions into AI buildouts and creating this demand spike; now they're trying to make consumers eat the cost. And let's be honest: Dell's RAM was already inflated, at roughly triple the real value. Adding another $400 overnight for the same 16 GB stick is outrageous. They're not buying new stock at panic pricing today; they're sitting on inventory. This move is nothing but opportunistic markup, and it's aggravating to watch.
-
This is from our Dell business sales manager letting me know that they now need to increase the price of a single 16 GB stick of RAM in our server config by an additional $400, on a stick that was already quoted at around $360+. Looks like manufacturers and vendors want us to pay the inflated rates created by the current AI-driven market panic they themselves created.
-
@win32asmguy That lines up with what's been observed across multiple Arrow Lake systems; the platform is touchy with PCIe power-state transitions. Disabling PCIe Power State Management might mask the symptom by preventing the low-power handoff that triggers a retrain, but I don't think it explains the pattern here. The key detail is this: most units with the same BIOS build are not throwing WHEA 17 across multiple endpoints during POST. If the BIOS setting alone were the root cause, the failure rate would be universal, not limited to a subset of machines. That leaves two scenarios:
1. A BIOS/EC firmware bug that only manifests on hardware that's already marginal (signal integrity, lane quality, power-state timing), or
2. Pure hardware variance on the board, where the firmware is exposing a weakness rather than causing it.
In both cases, the common thread is that healthy units don't log corrected link-training retries across the entire bus. That's why I lean toward hardware-level instability that the BIOS setting can temporarily hide, not cure. It's still worth a try, and your advice is logical, but the failure pattern doesn't look like a pure firmware toggle issue. It behaves like hardware that only just meets the edge of the timing window and loses the race during POST.
-
Did you test with the dGPU fully disabled or removed from the bus? You can run a clean test by cutting onboard dGPU power and rebooting. If the WHEA 17 warnings stop, it confirms the GPU endpoint is not failing in isolation, and the problem sits upstream in PCIe lane training or Embedded Controller/BIOS power sequencing. A single WHEA 17 limited to one endpoint could be a corrected error during boot, but once it repeats across several PCIe endpoints or appears after a GPU swap, it stops being a device issue and becomes a platform bring-up and lane-training problem. The 2025 Dell HX + Blackwell workstation stack brings up PCIe lanes under very tight signal integrity and early power gating controlled by BIOS and EC. If those layers mis-train or retain dGPU high-power states after POST, every endpoint sharing that root complex will log a corrected AER event (WHEA 17), even when Windows feels stable afterward. If you're not seeing freezes, headless crashes, or ACPI 13/15 spam every minute, then you're sitting on a bus that ultimately settles after retries. That's why it looks fine until a power transition hits again. Traditionally, this is a motherboard or CPU PCIe root complex issue handled best by replacement, not module swaps. If you ever start seeing black-screen flickers or runtime WHEA 17s increasing by the minute, you'll know the platform didn't settle cleanly. For now, it's stable only because it corrected itself on each boot, but a clean POST should not require correction at all.
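To make the before/after comparison concrete, count the corrected events logged since the last boot, then re-run the same query after pulling the dGPU from the path. A minimal PowerShell sketch; the provider name and event ID are the standard WHEA-Logger ones, nothing Dell-specific assumed:

# Count WHEA-Logger Event 17 (corrected PCIe/AER) entries since the last boot
$boot = (Get-CimInstance Win32_OperatingSystem).LastBootUpTime
$whea = Get-WinEvent -ErrorAction SilentlyContinue -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-WHEA-Logger'
    Id           = 17
    StartTime    = $boot
}
"WHEA 17 events this boot: $($whea.Count)"

A healthy platform should report zero here on every boot; any nonzero count at POST means the bus needed correction to come up.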
-
Here's the reality. When every major device on the PCIe bus logs WHEA-Logger Event 17 at boot, it doesn't mean the drives, Wi-Fi, or GPU are bad. It means the system had trouble training PCIe lanes at POST and had to retry the handshake before it stabilized. That's a firmware or board-level issue sitting above all endpoints. The spontaneous ACPI 13 and 15 events after boot point to the embedded controller and BIOS mishandling power states, and the 50-70 W idle draw shows the laptop isn't exiting early-boot high-power mode cleanly. The final symptom, the screen going black after an Advanced Optimus or MUX handoff without TDR logs, means the panel lost its display engine at the firmware level, not in Windows.

You bought a 2025 mobile workstation. A healthy system should bring up PCIe cleanly on the first try and never drop the panel without leaving a proper error trail. This isn't about drivers or storage population. This is a foundation problem, and the correct path is replacement. Shipping your only unit away for 5-12 days is unacceptable for a workstation role. Replacement without Collect & Return is possible, but you must frame it as a workflow disruption and a hardware foundation fault, not a driver issue. Dell escalation teams (L3/L3.5/L4) normally ship a replacement first when it's framed as "fault in platform bring-up and panel power-state corruption." Don't let support push you into a collect-and-return cycle unless they commit to a full swap. Stay polite, stay consistent, and make the warranty work for you; that's the only leverage we still have to ensure vendors deliver stable hardware. It's your money, time, and data they are putting on the line with their hit-or-miss QC strategy. They should be grateful you are giving them another try instead of asking for a full refund.
-
The devices throwing those boot warnings are PCIe endpoints. The IDs map to the Intel Wi-Fi 7 module, the Nvidia Blackwell GPU core, and the Nvidia audio controller that runs through PCIe during early boot (a quick way to dump those IDs yourself is at the end of this post). The warning flood means the system is retrying PCIe lane initialization and link training at POST, then recovering before Windows fully boots. Now that you also see WHEA 17 during normal use, and ACPI 13/15 events with 50-70 W idle draw, it's pointing at unstable lane bring-up and EC/BIOS power-state handling, not individual devices failing. This is platform power and PCIe link instability showing up when endpoints attempt memory access over lanes that never fully stabilize after EC power gating or GPU switching. If it still repeats with one SSD and a fixed GPU path, the next stop is a warranty case for the system as a whole.

Don't hesitate to request an exchange. You paid for a premium workstation, so treat the warranty as part of what you purchased. Dell ships replacements while you keep the current unit, which gives you leverage. If the next one isn't solid, repeat the process calmly until they deliver a platform that boots cleanly and maintains stable PCIe links and power states. Years ago this is exactly how ThinkPads and Latitudes were handled: swap until it works. Today's Dell QC is hit-or-miss, so a systematic exchange cycle is often the fastest path to a clean, validated root complex and EC state. I've replaced Dell purchases multiple times before landing on a healthy unit. Once you get a stable chassis, migrate your data, hand back the old one, and move forward. Keep your current laptop until you confirm the incoming system is the one worth imaging. If it takes several rounds, so be it. That's how hardware quality has traditionally been forced out of vendors, and it still works if you stay polite and consistent.
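If you want to map the warnings to endpoints yourself, the event message carries the bus/device/function and vendor/device IDs. A quick PowerShell sketch to dump the most recent entries; only the standard System log and WHEA-Logger provider are assumed:

# Dump the latest WHEA 17 messages; each one names the PCIe endpoint involved
Get-WinEvent -MaxEvents 10 -FilterHashtable @{
    LogName='System'; ProviderName='Microsoft-Windows-WHEA-Logger'; Id=17
} | Format-List TimeCreated, Message

Cross-reference the vendor/device IDs in the output against Device Manager (View > Devices by connection) to confirm which physical endpoints are retraining.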
-
If Advanced Optimus off fixed the black screen, that already tells you the issue sits in the switching pipeline, not the touchpad event. The WHEA-Logger Event 17 flood on multiple PCIe endpoints during boot is not individual devices failing. It means the system is struggling with PCIe lane initialization at POST, retrying link training before it settles. Since the warnings were there even before adding the Samsung 990 Pro or any drive changes, and persist after disabling ASPM, the remaining root causes are:
• The CPU PCIe controller (root complex), or
• Dell's BIOS/EC layer mis-training lanes at boot, or
• Board-level power/clock signaling noise affecting PCIe during early boot.
A quick way to confirm direction:
1. Remove all secondary PCIe storage.
2. Turn Advanced Optimus off in the BIOS.
3. Boot using only the main SSD.
4. Check if the WHEA 17 warnings reduce or stop (the Get-WinEvent count I posted earlier gives you a hard number to compare).
If the warnings still appear in the same volume, it points to a hardware or BIOS-level defect, most likely the mainboard or CPU PCIe path. No driver update is required for this kind of instability to surface; it can start after runtime state drift, stress, or uptime even when nothing visibly changed. If stability matters more than battery life, running with a fixed MUX path or AO off is the reliable choice. But long term, a workstation laptop should not produce PCIe training retries at every boot. If it continues even with minimal population, it's a warranty case for the system.
-
Even when ASPM is disabled, the vendor drivers still run their own power polling and interrupt management under the OS radar. You can't fix it by searching logs alone; you fix it by trimming the excess until only essentials remain. Most likely suspects:
• Realtek/Intel audio driver DPC spikes (shows up when menus open or apps launch, because system sounds try to initialize)
• iGPU/dGPU handoff stalls (Optimus flapping); not a "crash," just unstable driver behavior that Windows doesn't classify as failure
• USB controller power polling at interrupt level; shows up when a context menu, right-click, or window maximize triggers HID calls
None of these will scream error in the logs. They produce invisible queue stalls. Try disabling every Dell add-on service you don't actively use; not drivers that operate hardware, but services that pretend to optimize things for you (a sketch for finding them is below). Turn off system sounds temporarily to test whether audio driver latency is involved: Control Panel > Sound > Sound Scheme > No Sounds. If the freezes vanish or reduce, that points straight at the audio stack.
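As a starting point for the trimming, here's a hedged PowerShell sketch to surface the Dell add-on services. Exact service names vary by model and SupportAssist version, so treat the filter as an assumption, review the list, and disable selectively rather than wholesale:

# List Dell-branded services that are currently running
Get-Service |
    Where-Object { $_.DisplayName -match 'Dell|SupportAssist' -and $_.Status -eq 'Running' } |
    Select-Object Name, DisplayName, StartType

# Disable one you've identified as non-essential (run elevated; the name below is a hypothetical example)
# Stop-Service -Name 'DellOptimizer' -Force
# Set-Service -Name 'DellOptimizer' -StartupType Disabled

Disable one service at a time and retest the freeze, so you know which component was actually in the path.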
-
Good, that confirms ASPM is off on AC, so the PCIe power state isn't the smoking gun. You could have run:

powercfg /qh SCHEME_CURRENT SUB_PCIEXPRESS ASPM

If that also errors:

powercfg /query SCHEME_CURRENT SUB_PCIEXPRESS ASPM

What you want to see in the output is 0x0 or Off, meaning ASPM is disabled. For good measure, apply the suggested reg entries and reboot (the system already reports ASPM is off anyway), then confirm the status still persists.
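And if you ever need to force it rather than just confirm it, the setter side of the same powercfg interface does it without touching the registry directly; these are the standard built-in aliases, nothing vendor-specific:

# Force ASPM off on both AC and battery, then re-apply the active scheme
powercfg /setacvalueindex SCHEME_CURRENT SUB_PCIEXPRESS ASPM 0
powercfg /setdcvalueindex SCHEME_CURRENT SUB_PCIEXPRESS ASPM 0
powercfg /setactive SCHEME_CURRENT

Index 0 maps to Off for the Link State Power Management setting; re-run the query afterward to confirm it stuck.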