Welcome!

Welcome to the home page of Charles N Wyble. Charles is a 24-year-old systems guy, hacker and entrepreneur currently living in El Monte, CA, with his wife of 3 years.

He is currently employed as a system engineer for Ripple TV, with responsibility for a nationwide advertising network.

In his spare time he serves as Chief Technology Officer for the SoCalWiFI.net project, runs a hacker space in the San Gabriel Valley and tries to save the local economy.


Wednesday, July 01, 2009

[Fwd: CE200 DSLAM observations]

The OpenSDSL project lives on!


-------- Original Message --------
Subject: CE200 DSLAM observations
Date: Wed, 1 Jul 2009 07:10:45 GMT
From: msokolov@ivan.Harhan.ORG (Michael Sokolov)
Reply-To: opensdsl@ifctfvax.Harhan.ORG
To: opensdsl@ifctfvax.Harhan.ORG

Update on the bring-up of our test DSLAM: I have received the new fan
module from David at Routerspot and both fans are now running fine.
The DSLAM's Ethernet port is now connected to the Harhan shop network
and has an IP address assigned. The configuration save operation worked
fine as well (see below).

The next step is to bring out the DSL output and look at it with our
various hacky tools while configuring different encapsulations and
netmodels on the DSLAM. I want to determine experimentally exactly what
this DSLAM is able to serve out on the subscriber loops when it's being
used by itself without a Redback router on the other end of a DS3 ATM
pipe.

With Rhythms and NorthPoint being gone, it seems that the only remaining
network which operates CM DSLAMs with the luxury of DS3 backhaul and
Redback aggregation routers is MegaPath aka former DSL.net on the
Atlantic coast. (If there are any others, I would like to know who they
are!) All other remaining CM DSLAM operators seem to be small
independents like RRIC. If a network operator has a bunch of DSLAMs
deployed in different COs and has DS3 links connecting them to an even
more central "MegaPOP", they can afford to use each DSLAM as nothing
more than a Layer 2 device and provision each subscriber as a separate
PVC from the "MegaPOP" side. However, if you have a single DSLAM and
you want to serve a local contingent of customers from it, you would
simply want to feed raw upstream bandwidth to it and have the DSLAM do
the rest, right? Hence my desire to find out exactly what this DSLAM is
able to serve out when one can't use the Cross-Connect netmodel because
there's nothing to cross-connect to.

The CE200 DSLAM puts its subscriber loops out on 50-pin Amphenol
connectors commonly known as "female telcos". Tomorrow I'll be heading
to the local telco gear warehouse to pick up the right cable for that;
I already have an M66 block with Amphenol breakout.

I have also taken a little peek at the internals of this DSLAM. As I
had already observed a while back, it has two internal buses: CompactPCI
and a proprietary line card bus. The Buffer Control Module is basically
the bridge between these two buses. The line card bus interconnects all
line card slots and the two redundant BCM slots. There are two cPCI buses,
one on each side of the chassis for the redundant configuration. Each
cPCI complex consists of the System Control Module, WAN modules and the
BCM which links it to the line card bus and manages the redundancy
feature. My CE200 has a single cPCI complex on the left side, the
redundant slots on the right are unpopulated - apparently not too many
of these DSLAMs were shipped with the redundancy feature. (I've been
told they are so reliable that the redundant cPCI complex isn't really
needed.)

The SCM is an off-the-shelf Pentium SBC (single board computer) made by
Ziatech. (CM's documentation mentions there being 3 different versions
of the CE200 SCM and mine appears to match the description of version 1.)
The system's main CPU being an off-the-shelf SBC obviously raises the
next question of hacker's curiosity: what OS is it running? Well, I now
have the answer: it's VxWorks.

One of my past bosses at an embedded software engineering company has
referred to VxWorks as the "Microsoft of the embedded world", but that
was referring mostly to their proprietary licensing model. Those issues
aside, in purely technical terms VxWorks doesn't seem to be all that bad
when one needs a very complex and very specialized embedded system with
functionality markedly different from general purpose computer OSes.
But then perhaps I make a poor judge in this matter as I'm not very
familiar with VxWorks beyond hearsay.

There are two DE9 serial ports on the SCM, labeled "craft" and
"diagnostic". (According to CM's docs on SCM version 3 they've been
replaced with RJ45s using Cisco pinout, but I've never seen one with my
own eyes.) CM's docs talk only about the craft port and tell you to
leave the other one alone. Well, connecting to the diagnostic port like
CM tells you not to, one sees that it's the console port for VxWorks
running on the CPU. One can interact with the boot ROM, tell it to boot
a different image, and once it's booted, interact with the VxWorks
kernel and see the debug output from various processes. The craft port
in contrast is fully sanitized: all you see there is exactly what you
see when you telnet to a running DSLAM, with no low-level debug output of any
kind. Basically the craft port is a way to get to the management CLI
when the IP stack isn't up and usable.

There is one part of this DSLAM design that is likely suffering from a
blemish though, and that is the flash file system. Flash memory on the
SCM is used to store the software/firmware images for all the boards in
the system as well as the config.txt or config.tgz (DOS-ified config.txt.gz,
see below) file which contains all persistent configuration. My contact
at MegaPath has told me that just about the only failure mode seen on
these DSLAMs is that after a certain while the SCM loses the ability to
save its running configuration to persist across power cycles, and after
taking a peek at how its internal file system is implemented, I think I
have a guess as to why that happens.

The two file systems supported by VxWorks are DOS and RT-11. Well,
there is nothing fundamentally wrong with using either of these file
systems in an embedded system (I particularly like the RT-11 fs), except
one little problem: both file systems are designed for regular block
devices, not for flash.

The file system used by CM on the SCM is DOS, or at least it has all the
appearance of a DOS file system: drive letters, 8.3 filenames, backslash
as the path separator, DOS-looking directory listings, all filenames
converted to uppercase as the canonical form. SCM version 1 (the one I
have) has only one kind of flash on it (VxWorks drive letter P:) and it
appears to be NOR flash. According to CM's docs SCM version 2 adds an
IDE flash drive under letter Q:.

Well, as someone who does embedded system consulting for a living
outside of the telecom hobby, I can tell you that NOR flash is rather
delicate and that it's designed for directly executable code, not for
making a file system out of it. Implementing a reliable flash file
system on top of NOR flash is certainly possible (jffs2 in Linux is a
good example), but it takes certain care, and I have never heard of
anyone implementing a DOS file system on top of NOR flash.

Being designed for traditional disk devices with 512-byte sectors, the
DOS file system is totally unsuited for NOR flash. The only way to
implement such a thing would be to build a software emulation layer that
makes a virtual block device out of NOR flash, and that is tricky. If
one does the dumb thing of implementing writes of 512-byte "sectors" by
taking the much larger flash sector, stashing it away in RAM, erasing it
and reprogramming it in flash, the resulting system will not only wear
the flash out very quickly, but will also be vulnerable to
catastrophic fs corruption should the power go out in the middle of that
operation. I sure hope that CM's implementation of their DOS fs isn't
*that* dumb, but given all those field reports with the symptoms of
flash wearing out prematurely, who knows... Even if it isn't super-dumb,
it sure seems very suboptimal. There is the FTL standard that tells you
how to do it "right", but I'm not familiar with the details off the top
of my head and I still don't know how good it is - I would choose jffs2
over FTL any day in any system I design.
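
To put rough numbers on the "dumb" emulation described above, here is a
minimal Python sketch of it. This is an illustration only, not CM's
actual implementation, which I have not seen. The 64 KiB erase block
size, the class name NaiveNorBlockDevice, the save_config helper and the
sector numbers it touches are all assumptions made up for the example.

```python
SECTOR = 512               # logical sector size the DOS fs expects
ERASE_BLOCK = 64 * 1024    # ASSUMED NOR erase block size (hypothetical)


class NaiveNorBlockDevice:
    """Hypothetical block-device emulation that erases and reprograms
    a whole flash erase block for every 512-byte sector write."""

    def __init__(self, n_blocks):
        # Erased NOR flash reads back as all 0xFF bytes.
        self.flash = [bytearray(b"\xff" * ERASE_BLOCK)
                      for _ in range(n_blocks)]
        self.erase_counts = [0] * n_blocks

    def read_sector(self, lba):
        blk, off = divmod(lba * SECTOR, ERASE_BLOCK)
        return bytes(self.flash[blk][off:off + SECTOR])

    def write_sector(self, lba, data):
        assert len(data) == SECTOR
        blk, off = divmod(lba * SECTOR, ERASE_BLOCK)
        # 1. Stash the whole erase block in RAM and patch one sector.
        shadow = bytearray(self.flash[blk])
        shadow[off:off + SECTOR] = data
        # 2. Erase the block. A power loss between here and step 3
        #    loses every sector in the block, not just the one written.
        self.flash[blk] = bytearray(b"\xff" * ERASE_BLOCK)
        self.erase_counts[blk] += 1
        # 3. Reprogram the block from the RAM copy.
        self.flash[blk][:] = shadow


def save_config(dev, n_data_sectors=8):
    """Hypothetical config save: rewrite one FAT sector, one directory
    sector and a few data sectors. The sector numbers are made up."""
    dev.write_sector(1, b"\xfa" * SECTOR)        # FAT update
    dev.write_sector(2, b"\xd1" * SECTOR)        # directory entry
    for i in range(n_data_sectors):
        dev.write_sector(16 + i, b"\xc0" * SECTOR)  # file data


dev = NaiveNorBlockDevice(n_blocks=4)
for _ in range(100):
    save_config(dev)
print(dev.erase_counts)    # erase cycles per block after 100 saves
```

In this sketch every sector touched by a save lands in the same erase
block, so each save costs 10 erase cycles on that one block while the
other blocks see none. If one assumes an endurance on the order of
100,000 erase cycles per block (a commonly quoted figure for NOR parts,
not something measured on the SCM), that block would be used up after
about 10,000 saves, and sooner if the config file spans more sectors.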

My guess is that the reason CM added an IDE flash drive to
version 2 of their SCM was not because they needed more room, but simply
to avoid the configuration save flash wear problem. (Apparently when
that IDE flash drive is present, only the SCM code image stays on P: and
all other images and the config file move to Q:.) But they could have
fixed it by improving their software instead of throwing in more
unnecessary hardware!

Well, it's getting late here and I'm heading to bed. I will post more
when I wire up the DSL output from the beastie and play with some
configurations.

MS
