IBM* LAN SERVER 4.0 WHITE PAPER
       PERFORMANCE, CAPACITY ENHANCEMENTS, & TUNING TIPS
 
 
               IBM LAN SYSTEMS PERFORMANCE ANALYSIS
                         DEPARTMENT 55LS
                          AUSTIN, TEXAS
                           MARCH 1995
                           (Revised)
 
Contents
 
       Introduction
 
       LAN Server 4.0 Performance Tuning Assistant
             Introduction to the Tuning Assistant
             HPFS 386 Cache Size Calculation
             Examples of Key Parameter Calculations
             Using Tuning Assistant in "What if" Mode
 
       LAN Server 4.0 Configuration Defaults
 
       DOS LAN Services Client Performance Considerations
 
       LAN Server 4.0 Capacity Enhancements
 
       LAN Server 4.0 Support of SMP
 
       NetBIOS over TCP/IP
             Design Considerations
             Enhancements
             Performance Characteristics
             Tuning TCP/IP
             Recommendation: Dual Protocol Stacks
 
       Additional Useful Information
       Reducing NetBIOS Broadcast Frames
             DCDB Replication Performance
             Upgrading from LAN Server 3.0
             Considerations when RAW SMBs are disabled
             DOS TCP/IP
             Configuring DLS with Windows for Workgroups
 
       Additional Tips for LAN Server 4.0 Performance
             Entry vs. Advanced Server
             Fixed Disk Utilization
             CPU Utilization
             Network Interface Cards
             Network Media Utilization
 
       Performance Benchmark Comparison
 
 
 
Introduction
 
LAN Server 4.0 includes features that allow increased capacity and
performance over LAN Server 3.0.  Architectural limitations of LS 3.0 have
been addressed in LS 4.0.  Parameter defaults have been increased so that a
newly installed Advanced Server supports 100 users without modifications.
Client-side caching has been added to the DOS client, resulting in improved
performance.  A new protocol driver that runs at ring 0 privilege along with
OS/2* NetBIOS over TCP/IP provides greater performance than the OS/2 NetBIOS
for TCP/IP used by LS 3.0. In addition, LS 4.0 provides a tool (LAN Server
4.0 Tuning Assistant) to help users tune their specific configurations for
optimum performance.
 
The features described above are documented in the LS 4.0 publications.  It
is the intent of this paper to provide additional information, such as
design considerations and performance analysis results, from the LAN Server
Performance Analysis group.
 
 
LAN Server 4.0 Performance Tuning Assistant
 
  Introduction to the Tuning Assistant
 
    The Tuning Assistant was designed to satisfy the following usability and
    performance objectives:
 
    -      Provide an easy way for users to tune LS 4.0 to their configurations
    -      Provide performance tuning based on each user's unique situation
    -      Optimize performance parameters but leave a safety margin
    -      Provide a tool to allow "what if" calculations
 
    The following general rules are implemented by the Tuning Assistant during
    calculations and modifications:

    -      Never reduce parameters below their default values
    -      Never add or delete lines from any configuration file
    -      Never exceed maximums such that the system will not boot
    -      Spread NetBIOS resource requirements equally over all adapters
    -      Give LAN Server priority if NetBIOS resources are overcommitted
 
    In most environments the important elements in tuning LAN Server for best
    performance are (in priority order) the following:

    -      Configure the largest HPFS386 cache possible
    -      Provide a sufficient number of request buffers (NUMREQBUF)
    -      Provide a sufficient number of commands (the 'x2' parameter in the
           'netx' line of IBMLAN.INI)
    -      Provide enough adapters (and NetBIOS resources) for the number of
           users
    -      Provide a sufficient number of big buffers (NUMBIGBUF; Entry Server
           or print spooling only)
    -      Reserve sufficient memory for the GUI if it will be used frequently
           and the delay for swapping is undesirable
 
  HPFS386 Cache Size Calculation
 
    The HPFS386 cache size calculation involves determining all the uses of
    memory in the system and assigning the remainder to the cache.  Listed
    below are the factors used by the Tuning Assistant in the calculation of
    the HPFS386 cache size of a system with a memory size of 32.0 MB.
 
    Memory Allocation (MB):
           - OS/2 base                                   2.8
           - Spooler                                     0.7
           - MPTS base                                   0.6
           - Memory per adapter                          0.2
           - LAN Server base                             3.5
           - IBMLAN.INI additional                       1.5
           - Heap reserve                                1.0
           - Cache management (64 bytes per              1.3
             1 KB of memory in excess of 12 MB)
           - Reserve for other apps                      1.0
           - Safety margin (5 percent)                   1.6
           - Memory assigned to HPFS386 cache           17.8
                        TOTAL                           32.0 MB
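
     The worksheet above can be expressed as a small calculation.  The
     following is an illustrative Python sketch of the Tuning Assistant's
     logic, not the tool's actual code; the fixed overhead values come from
     the worksheet, and the function name and parameters are descriptive
     inventions.

```python
def hpfs386_cache_mb(total_mb, adapters=1, app_reserve_mb=1.0):
    """Approximate the Tuning Assistant's HPFS386 cache size calculation.

    Fixed overheads (MB) from the worksheet: OS/2 base (2.8), spooler
    (0.7), MPTS base (0.6), LAN Server base (3.5), IBMLAN.INI
    additional (1.5), and heap reserve (1.0).
    """
    overhead = 2.8 + 0.7 + 0.6 + 3.5 + 1.5 + 1.0
    overhead += 0.2 * adapters            # memory per adapter
    overhead += app_reserve_mb            # reserve for other apps
    # cache management: 64 bytes per 1 KB of memory in excess of 12 MB
    overhead += (64 / 1024) * max(total_mb - 12, 0)
    overhead += 0.05 * total_mb           # 5 percent safety margin
    return total_mb - overhead
```

     For a 32 MB system with one adapter this yields about 17.85 MB, which
     matches the worksheet's 17.8 MB figure once its entries are rounded to
     one decimal place.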
 
     The entries for the OS/2, MPTS, and LAN Server bases are the same as those
     found in the Memory Estimating Worksheets in the Network Administrator
     Reference, Volume 1, Appendix A.
 
    The "IBMLAN.INI additional" entry is for increases made to certain
    parameters by the Tuning Assistant.
 
    The Heap reserve memory is set to 1 MB; this memory is used by LAN Server
    for internal file system control needs such as open file handle tables,
    search handles, filename parsing, etc.  This memory is not assigned to
    the Heap parameter but merely set aside for availability.
 
    The cache management entry is necessary when dealing with large cache sizes
    as it becomes a significant amount of memory.  The formula is applied
    after subtracting 12 MB from the system memory size since at least 12 MB
    of memory is always needed and is never available for cache.
 
    By default, 1 MB is always reserved for other user applications to be run
    on the server.  This is a parameter used by the Tuning Assistant to
    provide the administrator with a significant input to the cache size
    calculation. Note:  The administrator should determine the memory
    requirements of any application that is to run concurrently with LAN
    Server and provide that value in the 'Application Reserve Memory' entry
    field of Tuning Assistant.
 
    An important example of an additional user application on the server is
    the new LS 4.0 Graphical User Interface (GUI).  If the administrator will
    regularly use the GUI administration feature, at least 5 MB should be
    entered for this parameter to provide good performance of the GUI.  If
    this amount of memory is not available, significant swapping will occur
    when the GUI is started.  If only occasional use of the GUI on the server
    is expected, the recommendation is to leave this parameter at 1 MB and use
    the additional 4 MB system memory for HPFS386 cache.
 
  Examples of Key Parameter Calculations
 
  NUMREQBUF (IBMLAN.INI)
 
    The optimum number is 2 to 3 per "active" user.  Since request buffer
    memory is locked away from other processes, we want to be efficient.
    Also, since most uses of NUMREQBUF require a corresponding command, it is
    wasteful to allocate more request buffers than commands.  Only 250
    commands per adapter are configured by the Tuning Assistant, thus only
    250 request buffers per adapter will be configured.
 
    Calculation:  2.2 times MAXUSERS with maximum of 250 per adapter.
 
    Special Considerations:  Memory used by NUMREQBUF is calculated in Tuning
    Assistant using a hard coded value of 4096 bytes for each request
    buffer. If the user wants to change SIZREQBUF from 4096 to 2048,
    then the calculated HPFS386 cache size can be increased by
    (NUMREQBUF * SIZREQBUF) /2.
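
     The rule and the cache adjustment above can be sketched as follows
     (illustrative Python, not the Tuning Assistant's code; the function and
     parameter names are descriptive inventions):

```python
def numreqbuf(maxusers, adapters=1):
    # 2.2 request buffers per user, capped at 250 per adapter
    return min(int(2.2 * maxusers), 250 * adapters)

def cache_increase_bytes(numreqbuf, old_sizreqbuf=4096, new_sizreqbuf=2048):
    # memory freed by shrinking the request buffers; halving SIZREQBUF
    # frees (NUMREQBUF * SIZREQBUF) / 2 bytes for the HPFS386 cache
    return numreqbuf * (old_sizreqbuf - new_sizreqbuf)
```

     For example, 100 users on one adapter gives 220 request buffers, while
     150 users hits the 250-per-adapter cap.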
 
    A parameter related to NUMREQBUF is the USEALLMEM parameter in the
    Requester section of the IBMLAN.INI.  This parameter allows request
    buffers to be defined in the memory above 16 MB.  If no network interface
    cards (NICs) are limited to 24-bit direct memory access (DMA), and more
    than 16 MB of RAM is installed in the machine, set this parameter to
    'YES'.
 
  NUMBIGBUF (IBMLAN.INI)
 
    Big buffers (NUMBIGBUF) are used only by the ring 3 (Entry) Server when files are
    accessed on a FAT or HPFS file system or the Printer Spooler is accessed.
    Because better performance can be obtained by using all available memory
    for HPFS386 cache, NUMBIGBUF will not be increased if the LAN Server
    Advanced package is installed.
 
    Calculation: If Advanced Server, NUMBIGBUF = 12 (default).  If Entry
    Server, NUMBIGBUF increases to a maximum of 80 as MAXUSERS increases.
 
  Commands (IBMLAN.INI)
 
    For optimum performance, commands also need to be 2 to 3 per "active" user
    because of its close relationship with NUMREQBUF.  Obviously, if 250 users
    are logged on through one adapter, each user will not have 2 to 3 commands
    always available and performance will be less than optimum.  The
    IBMLAN.INI parameter that specifies commands is 'x2' in the 'netx'
    statement.
 
    Calculation:  2.2 times MAXUSERS with maximum of 250 per adapter
 
    Special Considerations: Commands (NCBS in PROTOCOL.INI) are NetBIOS
    resources and must be shared with other NetBIOS applications such as DB2*,
    Lotus Notes**, etc.  If a user specifies MAXUSERS >= 114, commands of
    250 will be set for LAN Server's net1 line, leaving only 4 commands for
    other NetBIOS applications. Users should manually reduce commands in the
    net1 line to allow the other applications more NCBS resources if required.
 
  Maxusers (IBMLAN.INI)
 
    Calculation:  # DOS/Windows** users + # OS/2 users + # additional servers
    (if Domain Controller)
 
  Maxshares (IBMLAN.INI)
 
    Calculation: Number home dirs + # aliases + (3 * number shared apps).
 
  Maxconnections (IBMLAN.INI)
 
    Calculation:  (MAXUSERS + # additional servers) * 4
 
    Special considerations: Advanced Server maintains its own set of connection
    resources; this parameter pertains only to resources shared by the ring 3
    (Entry) Server such as print aliases.  This is also true for the following
    parameters which are not changed for Advanced Server (if HPFS only):
    MAXLOCKS, MAXOPENS.
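
    The three calculations above can be written out as a small sketch
    (illustrative Python; the function and parameter names are descriptive
    inventions, not part of the Tuning Assistant):

```python
def maxusers(dos_win_users, os2_users, additional_servers=0):
    # additional servers are counted only on a Domain Controller
    return dos_win_users + os2_users + additional_servers

def maxshares(home_dirs, aliases, shared_apps):
    return home_dirs + aliases + 3 * shared_apps

def maxconnections(maxusers_value, additional_servers=0):
    return (maxusers_value + additional_servers) * 4
```

    For example, 80 DOS/Windows users, 20 OS/2 users, and 1 additional
    server give MAXUSERS = 101, which is the LS 4.0 Advanced default.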
 
  Maxsearches (IBMLAN.INI)
 
    Calculation: The Tuning Assistant sets MAXSEARCHES = 700
 
    Special considerations: Advanced Server maintains its own set of search
    resources; this parameter pertains only to searches done by the ring 3
    (Entry) Server.  This value was chosen to provide ample search memory for
    the ring 3 (Entry) Server.
 
  Sessions (PROTOCOL.INI)
 
     Calculation: DOS/WIN requesters + OS/2 requesters + # additional servers
                  (if DC) + Lotus Notes requesters + DB2 requesters + users
                  logged on at the server + other NetBIOS requirements
 
 
  NCBS Calculation (PROTOCOL.INI)
     For optimum performance, NCBS also need to be 2 to 3 per "active" LAN
     Server user, plus the NetBIOS commands needed by other NetBIOS
     applications.

     Calculation: 2.2 times MAXSESSIONS + other NetBIOS requirements, up to a
                  maximum of 254 per adapter

     Special considerations: NCBS in PROTOCOL.INI are shared with other NetBIOS
     applications like DB2, Lotus Notes, etc. LAN Server will use a maximum of
     250 of the 254.
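
     The NCBS rule above can be sketched as follows (illustrative Python;
     the names are descriptive inventions):

```python
def ncbs(maxsessions, other_netbios=0, adapters=1):
    # 2.2 NCBs per session plus the needs of other NetBIOS
    # applications, capped at 254 per adapter (LAN Server itself
    # uses at most 250 of the 254)
    return min(int(2.2 * maxsessions) + other_netbios, 254 * adapters)
```

     For example, 100 sessions with 50 NCBs reserved for other NetBIOS
     applications hits the 254 cap on one adapter but not on two.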
 
  Using Tuning Assistant in "What if" Mode
 
     In response to feedback from Beta users, the "What if" mode was added,
     although it is not described in the product documentation.  This is the
     capability to run Tuning Assistant calculations on a machine other than
     the one where LAN Server 4.0 is installed, allowing a user to provide
     system configuration information to the Tuning Assistant and create tuned
     configuration files for use on other machines. IBM is interested in your
     experience with the Tuning Assistant.  Please post your comments and any
     operational concerns to the LS40 CFORUM and someone will respond.
 
    The Tuning Assistant's filename is LS40TUNE.EXE; it is located in the
    IBMLAN directory. The additional parameters which can be used in the
    command line launch of Tuning Assistant are as follows:
 
           /D:DOMAIN1 - Domain name (has no effect on calculations)
           /S:SERVER1 -  Server name (has no effect on calculations)
           /T:DC(or AS) - Type: Domain controller or Additional server
           /P:Entry(or Advanced) Package - Entry or Advanced version
           /M:XX -  System Memory in MB
           /A:N - Number of network interface cards(adapters)
           /U - User supplied files (CONFIG.SYS, IBMLAN.INI, PROTOCOL.INI,
           and HPFS386.INI)
 
           NOTE: /T is always required whenever /U is specified if running on
           a system with no server installed.
 
    Example 1
 
           LS40TUNE /D:DOMAIN1 /S:SERVER1 /T:DC /P:ADVANCED /M:32
           /A:2 /U
 
    To run Tuning Assistant this way, all four of the Advanced version
    configuration files must be located in the current subdirectory with
    LS40TUNE. This will run on a machine with or without a server installed.
    The command line values will take precedence over any actual system version
    of these parameters. If the "Apply" pushbutton is chosen the user supplied
    files will be changed and no backup files will be made.
 
    Example 2
 
           LS40TUNE /M:32 /A:2
 
    This will run only on a machine with a server installed. The command line
    values will take precedence over actual system versions of these
    parameters. This example could be useful to look at the effects of more
    system memory or network interface cards on the tuning calculations. When
    the "Apply" pushbutton is chosen the system configuration files will be
    changed and backup files will be copied into the \IBMLAN\BACKUP
    subdirectory with names like IBMLAN.001, PROTOCOL.001, etc. All files that
    are updated when the Tuning Assistant calculation is 'applied' will have
    the same suffix.
 
    Warning: The "What if" feature is useful in examining the logic of the
    Tuning Assistant, but you should be careful when creating actual
    configuration files for use on systems other than the one on which
    the tool was executed.
 
 
LAN Server 4.0 Configuration Defaults
 
  The Advanced version of LS 4.0 may be used in larger configurations than
  previous versions. Therefore, default values of a number of parameters have
  been increased. The objective is to allow many users to run LS 4.0 out-of-box
  with little or no customized tuning. The Advanced Server will support 100
  users in typical environments, however, running the Tuning Assistant may
  provide an additional performance improvement for some customers. Some
  changes to the Entry Server and Peer Services were also made.  A summary
  follows:
 
    IBMLAN.INI           ADVANCED         ENTRY          PEER
    PARAMETERS            SERVER         SERVER        SERVICES

                        LS 3.0  LS 4.0  LS 3.0  LS 4.0  LS 3.0  LS 4.0
   ---------------------------------------------------------------------
   | maxopens         |  576  |  256  |  576  |  160  |  576  |  128  |
   ---------------------------------------------------------------------
   | maxsearches      |   50  |  350  |   50  |  150  |   50  |   50  |
   ---------------------------------------------------------------------
   | numbigbuf        |   12  |   12  |   12  |    6  |   12  |    4  |
   ---------------------------------------------------------------------
   | numreqbuf        |   36  |  250  |   36  |   48  |   36  |   10  |
   ---------------------------------------------------------------------
   | maxshares        |   16  |  192  |   16  |   64  |   16  |   16  |
   ---------------------------------------------------------------------
   | maxusers         |   32  |  101  |   32  |   32  |    5  |    5  |
   ---------------------------------------------------------------------
   | maxconnections   |  128  |  300  |  128  |  128  |   26  |   26  |
   ---------------------------------------------------------------------
   | x1 (in net1)     |   32  |  102  |   32  |   34  |   32  |   34  |
   ---------------------------------------------------------------------
   | x2 (in net1)     |   50  |  175  |   50  |   70  |   50  |   70  |
   ---------------------------------------------------------------------

   -------------------------------
   | PROTOCOL.INI | SAME FOR ALL |
   | PARAMETERS   |   VERSIONS   |
   -------------------------------
   |              | LS 3.0|LS 4.0|
   -------------------------------
   | sessions     |   40  |  130 |
   -------------------------------
   | ncbs         |   95  |  225 |
   -------------------------------
 
  HPFS386 Cache Defaults
 
    The HPFS386 cachesize was specified in the IFS line in CONFIG.SYS in LS
    3.0. For LS 4.0 it is specified in the \IBM386FS\HPFS386.INI file with a
    line reading "cachesize = xxxx" in the FILESYSTEM section.
 
    The algorithm for determining the default HPFS386 cache size has also
    changed. Previously the cache size was set at 20 percent of the remaining
    memory after OS/2 was started. This gave a cache size of 2.9 MB on a 16 MB
    system. This formula will still be used as long as there is less than 20 MB
    of memory in the system. If the system has at least 20 MB of memory and the
    user has indicated that the server can use memory above 16 MB for cache,
    the default cache size will be 60 percent of remaining memory after OS/2
    has started. This will yield a cache size of around 18 MB on a 32 MB system.
    This will enable LS 4.0 to provide excellent performance on most systems
    without any tuning.
 
    As with earlier releases of LAN Server, the USEALLMEM parameter defaults
    to 'NO'.  This restricts access to memory above 16 MB.  If no network
    interface cards (NICs) or disk adapters are limited to 24-bit direct memory
    access (DMA), and more than 16 MB of RAM is installed, this parameter
    should be set to 'YES'.  This parameter used to be in the CONFIG.SYS file
    but is now in the FILESYSTEM section of the HPFS386.INI file.
 
 
DOS LAN Services Client Performance Considerations
 
  OS/2 LAN Server 4.0 comes with DOS LAN Services (DLS) clients.  DLS clients
  offer substantial performance improvements over the DOS LAN Requester (DLR)
  clients provided with LAN Server 3.0. Significant performance benefit is
  realized through the implementation of client-side caching algorithms.  In
  brief, client-side caching offers local caching, reducing requests to the
  server, thereby increasing overall system performance.
 
  Client-side caching is enabled by default with DLS clients. This means that
  the AUTOCACHE parameter is set to YES in the NETWORK.INI file (since this
  is a default, it is not specifically written into NETWORK.INI during
  installation). With AUTOCACHE=YES, the DLS client will allocate big buffers
  in extended memory (XMS).  Each big buffer is 8 KB in size.  The number of
  big buffers is calculated by the system and is dependent on the amount of
  XMS available (up to a maximum of 30 big buffers).  If a machine does NOT
  have any XMS, the AUTOCACHE parameter is effectively ignored.  If you want
  to configure big buffers on a DLS client that has no XMS, set the following
  parameters in the NETWORK.INI file:
 
         1.    AUTOCACHE=NO
         2.    SIZBIGBUF=xxxx (in bytes)
         3.    NUMBIGBUF=xx (integer)
 
  This will allocate big buffers on the client that can be used for large data
  transfers.  However, these buffers will be put in Upper and/or Conventional
  Memory, reducing available memory for applications.
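
  For illustration, the three settings above might appear in NETWORK.INI as
  follows (the [network] section name is assumed here, and the buffer values
  are hypothetical examples, not recommendations):

```
[network]
autocache=no
sizbigbuf=4096
numbigbuf=4
```

  Remember that on a client without XMS these buffers come out of Upper
  and/or Conventional Memory.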
 
  Another parameter of importance with DLS clients is the WORK BUFFERS.
  These are the buffers that are used on the requester to process an
  application's request for data. The default values for WORK BUFFERS on the
  DLS client are as follows (also set in NETWORK.INI).
 
         1.    SIZWORKBUF=1024
         2.    NUMWORKBUF=2
 
  The above default values are the generally recommended values for the best
  system performance. If you are unable to use the AUTOCACHE option, you may
  want to experiment with these two parameters for possible improvements in
  your environment.
 
 
LAN Server 4.0 Capacity Enhancements
 
  As the number of workstations connected to LAN Server 3.0 grew into the
  hundreds in some installations, an architectural limitation was discovered
  which has been addressed in LAN Server 4.0. Specifically, a data structure
  design limited the number of request buffers (NUMREQBUF) which could be
  configured to a maximum of around 350. In large installations this could
  cause a performance degradation. LAN Server's new design provides future
  extensibility by allowing the value of NUMREQBUF to be as large as 2000.
 
  The current recommended value for NUMREQBUF is 2.2 per user up to a
  maximum of 250 for each adapter, or 1000 if four adapters are in the system.
 
 
LAN Server 4.0 Support of SMP
 
  LAN Server 4.0 has been tested with and shown to support symmetric
  multi-processor (SMP) machines running under OS/2 for SMP. LAN Server 4.0
  Advanced does not gain additional performance benefits from SMP machines.
  Its architecture has been optimized to the point where most requests are
  processed "on interrupt" when received from the network component.  The
  queuing time for a request to be processed is usually extremely short since
  there are rarely instances when a file/print server's CPU approaches 100
  percent utilization. Under these conditions, it would not be expected that
  an additional CPU would improve response time to the requester.  This design
  provides industry leading performance as evidenced by the LANQuest** report
  of October 1994.  See the Performance Benchmark Comparison section for more
  information.
 
  There are some situations in which LAN Server 4.0 support of SMP does lead
  to an improvement in total system throughput performance. Since OS/2 is a
  multi-tasking operating system, other applications can run in the same
  machine as LAN Server. For other applications which make extensive use of
  the CPU (e.g., Lotus Notes), additional processors may make sense.
  Whenever the CPU workload approaches 100 percent, the additional processor
  can make a significant difference in the system throughput. LAN Server 4.0
  Advanced accommodates the use of the additional processor unless its own
  workload is unusually high in which case it takes precedence over other
  applications. LAN Server 4.0 Entry runs with the same privilege as other
  OS/2 applications and does not take precedence in an SMP environment.
 
 
NETBIOS Over TCP/IP
 
  Design Considerations
 
    NetBIOS over TCP/IP is an implementation of NetBIOS that has been
    specifically designed to operate with IBM TCP/IP.  It enables a
    workstation to be geographically isolated from its domain yet communicate
    with it transparently.
 
    NetBIOS over TCP/IP is an implementation of the Request for Comments (RFCs)
    1001/1002 standards which describe how to enable NetBIOS applications over
    TCP/IP.  It is a B-node, or Broadcast node implementation with routing
    extensions.  A broadcast node uses broadcasting to exchange information
    between hosts.  The routing extensions allow nodes to span subnets through
    IP routers. These extensions plus the remote name cache discussed below
    simplify the configuration of RFC 1001/1002 NetBIOS nodes into TCP/IP
    environments.
 
    NetBIOS over TCP/IP uses an expanded syntax for NetBIOS names that is
    transparent to NetBIOS applications.  The Local NetBIOS Name Scope String
    is appended to the NetBIOS name creating an expanded name that has the
    effect of limiting the scope of a NetBIOS name.  Two RFC-compliant NetBIOS
    nodes can communicate only if they have the same Local NetBIOS Name Scope.
    The Local NetBIOS Name Scope string is defined by the LOCALSCOPE parameter
    in the TCPBEUI section of the PROTOCOL.INI.
 
    NetBIOS over TCP/IP supports only one logical NetBIOS adapter and should
    therefore be added to only one network interface card during the
    installation/configuration process.  However, if TCP/IP is installed on
    multiple adapters, NetBIOS over TCP/IP will make use of those adapters.
 
    TCPBEUI is IBM's high performance, ring zero protocol driver which maps
    NetBIOS API calls into the TCP/IP protocol. NetBIOS over TCP/IP contains
    enhancements over the RFC 1001/1002 standards which improve system
    performance by decreasing broadcast storms, and expanding communications
    over routers and bridges. These enhancements, described in the next
    section, are transparent to NetBIOS applications and do not interfere
    with other B-node implementations that lack similar functions.
 
  Enhancements
 
    Three of the enhancements to NetBIOS over TCP/IP are in the form of routing
    extensions.  These extensions allow communication between networks and over
    IP routers and bridges.  These extensions are:
 
    1.     The broadcast file.  A broadcast file contains a list of host names, host
           addresses, or directed broadcast addresses.  It is read at startup
           and each valid address is added to the set of destination addresses
           for broadcast packets. Remote nodes included in the broadcast file
           are then treated as if they were on the local network. Use of a
           broadcast file has the effect of extending a node's broadcast domain
           to its own subnet plus any other subnets listed in the broadcast
           file.  A maximum of 32 broadcast file entries are supported, each
           of which could include additional subnets, thus extending the node's
           broadcast domain.
 
    2.     The names file.  A names file consists of NetBIOS name and IP
           address pairs. NetBIOS over TCP/IP will conduct a prefix search of
           the names file before broadcasting on the network. The prefix match
           succeeds if the entry in the names file matches the given name, up
           to the length of the entry. The first match is used, therefore, the
           order in which NetBIOS names are listed in the names file is
           important.
 
           To enable this routing extension, set the NAMESFILE parameter in the
           TCPBEUI section of the PROTOCOL.INI to a nonzero integer that
           represents the number of names file entries.
 
    3.     The Domain Name Server (DNS). A network administrator can maintain
           NetBIOS name and IP address pairs in a DNS.  If a name query fails
           NetBIOS over TCP/IP can append the NetBIOS Domain Scope String to
           the encoded NetBIOS name and issue a request to the DNS to look up
           an IP address for that NetBIOS name.
 
           The Domain Scope String is defined by the PROTOCOL.INI parameter
           DOMAINSCOPE.
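
    For illustration, a names file is simply a list of NetBIOS name (or name
    prefix) and IP address pairs; the names and addresses below are
    hypothetical:

```
SERVER1     9.67.1.10
SRV         9.67.1.20
```

    Because the lookup is a prefix match, the SRV entry would match SRV01,
    SRV02, and so on, which is why more specific names should be listed
    first.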
 
    Another enhancement NetBIOS over TCP/IP provides is a cache for storing
    remote names that have been discovered.  This cache is enabled by setting
    the NAMECACHE parameter in the TCPBEUI section of the PROTOCOL.INI to a
    nonzero integer that represents the number of names stored in the directory
    (NAMECACHE=xx).
 
    The information in the remote names cache (or directory) is also stored on
    disk and periodically updated.  When the system is restarted, this
    information can be preloaded into the cache at bootup time.  Preloading
    can reduce the amount of broadcast frames on the network since NetBIOS
    will not have to rediscover names for remote names.  To preload the remote
    names cache, set PRELOADCACHE=YES in the TCPBEUI section of the
    PROTOCOL.INI.
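
    Pulling the parameters above together, a TCPBEUI section of PROTOCOL.INI
    with these enhancements enabled might look like the following sketch (the
    section name and the values shown are illustrative assumptions, not
    recommendations):

```
[TCPBEUI_nif]
   NAMESFILE = 20
   NAMECACHE = 100
   PRELOADCACHE = "YES"
```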
 
    NOTE: When NetBIOS over TCP/IP is searching for a name, the name cache is
    checked first, then the names file, the broadcast file, and finally the
    Domain Name Server.
 
    Recommendation:  When running NetBIOS over TCP/IP in a Wide Area
    Network (WAN), turn name caching on at the server (e.g. NAMECACHE=100).
 
  Performance Characteristics of NetBIOS over TCP/IP
 
    The performance difference between NetBIOS over TCP/IP and NetBEUI can
    range widely depending on the environment.  Some environmental factors
    that can affect performance are the type of client (OS/2 or DOS), the
    server CPU workload, the type of network operations being performed, the
    network media, network congestion, and communication line speeds.  We've
    observed the performance of NetBIOS over TCP/IP being anywhere from 10%
    slower to as much as 4 times slower than NetBEUI.
 
    One of the environments in which we conducted performance tests was a
    medium-sized Local Area Network on 16 Mbps Token Ring with no WAN
    connections. We ran a set of industry standard business applications on
    OS/2 NetBIOS over TCP/IP clients and again on OS/2 NetBEUI clients.  In
    this environment, NetBIOS over TCP/IP was 20% slower than NetBEUI.  The
    performance of DOS NetBIOS over TCP/IP clients was significantly less
    than that of the OS/2 clients.
 
    Database applications generally use small records when accessing shared
    databases residing on the server. Often these small records are retrieved
    from the file system cache with no physical disk access being required.
    The performance of this type of application on NetBIOS over TCP/IP may
    be noticeably slower than if the application were run using NetBEUI.
    However, if the number of database accesses of this type in performing a
    typical operation is on the order of hundreds, not thousands, the user may
    not notice a difference in performance between the two protocols.
 
    It may be necessary to periodically update client applications or other
    files by copying them from the server disk.  DCDB replication from a
    domain controller to a remote additional server also generates I/O
    operations sometimes known as file transfers. This type of file I/O
    activity over a network will show little or no performance difference
    between NetBEUI and NetBIOS over TCP/IP due to protocol characteristics.
    One should be aware, however, that most WAN connections today are made
    over relatively low speed communication lines when compared with a LAN
    speed of 4 to 16 Mbps. File transfer operations over WAN communication
    lines will probably be slower than over LANs but most likely not due to
    the network protocol.
 
  Tuning TCP/IP
 
    If you're using NetBIOS over TCP/IP in a Local Area Network environment,
    file transfer performance might be improved by increasing the maximum
    transmission unit (MTU) size.  We have seen up to a 20 percent increase
    in performance of large file transfers by using an 8 KB packet instead of
    the default of 1500 bytes. The default of 1500 was chosen because of
    Ethernet's packet size limitation and its prevalence in TCP/IP
    environments.
 
    The MTU size can be changed with the IFCONFIG command in TCP/IP's
    SETUP.CMD.  Set the MTU size to the desired packet size plus 40 bytes, the
    maximum TCP/IP header size.  The desired packet size should be a multiple
    of 2048.
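
    The arithmetic above can be sketched as follows (illustrative Python,
    not part of any IBM tool):

```python
def mtu_for_packet(packet_bytes):
    """Return the MTU for a desired packet size.

    The desired packet size should be a multiple of 2048; 40 bytes
    are added for the maximum TCP/IP header size.
    """
    if packet_bytes % 2048 != 0:
        raise ValueError("packet size should be a multiple of 2048")
    return packet_bytes + 40
```

    For an 8 KB packet this gives an MTU of 8232, so the IFCONFIG line in
    SETUP.CMD would specify an MTU of 8232 (the exact IFCONFIG syntax depends
    on your TCP/IP level; check its documentation).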
 
    Your network adapter must be configured to support transmission of buffers
    that are at least the size specified for the MTU. On an IBM 16/4 Token Ring adapter,
    this would be accomplished by setting the XMITBUFSIZE parameter in the
    Token Ring section of the PROTOCOL.INI file.  Check your network interface
    card documentation for information on configuring your adapter.
 
  Recommendation: Dual Protocol Stacks
 
    Because there may be a performance difference in a particular environment,
    we recommend configuring NetBEUI for the Local Area Network (LAN)
    environment and NetBIOS over TCP/IP for the Wide Area Network (WAN)
    environment.  The Multi-Protocol Transport Services (MPTS) shipped with
    LAN Server 4.0 provides the capability of configuring your LAN
    workstation or server with both NetBEUI and NetBIOS over TCP/IP on the
    same network interface card.
 
    The dual protocol stack can be configured through the LAN Server
    installation/configuration program.  When selecting protocols, install
    logical adapter 0 with NetBEUI and logical adapter 1 with TCP/IP and
    NetBIOS over TCP/IP.  This dual protocol stack configuration allows local
    sessions to continue running with NetBEUI performance while also providing
    Wide Area Network connectivity with NetBIOS over TCP/IP.
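    As an illustrative sketch only (MPTS generates the actual file, and the
    driver and section names may differ on your system), the resulting
    PROTOCOL.INI bindings could look like the following:

```ini
[NETBIOS]
   DriverName = netbios$
   ADAPTER0 = netbeui$,0    ; logical adapter 0: NetBEUI for the LAN
   ADAPTER1 = tcpbeui$,1    ; logical adapter 1: NetBIOS over TCP/IP for the WAN
```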
 
 
 
Additional Useful Information
 
  Reducing NetBIOS Broadcast Frames
 
    A key concern for many NetBIOS users is the amount of broadcast traffic
    that occurs on the network.  NetBIOS uses broadcast frames to communicate
    between nodes, and broadcast storms can slow network performance and
    overwhelm routers.
    Use of the Remote Name Directory (RND) function can help to minimize this
    broadcasting by sending frames to specific nodes when possible.
 
    When using RND, the local station caches the node addresses of remote names
    that it has located.  Any message sent to that remote name after the node
    address has been saved is sent directly to that node rather than broadcast
    to all nodes.
 
    The RND function in LAN Server 4.0 has been extended to include datagrams.
    RND stores only unique names and no group names, so if an application uses
    mostly group names for sending datagrams, RND should not be used.  Another
    enhancement to the RND function is that the maximum number of directory
    entries has been increased from 255 to 2000 when running on OS/2 2.0 or
    greater.
 
    The RNDOPTION parameter in the NETBEUI section of the PROTOCOL.INI file
    specifies whether RND is turned on or off.  Set this parameter to 1 to
    enable the RND function.  If RNDOPTION is enabled, make sure that
    DATAGRAMPACKETS in the NETBEUI section is greater than 2.  A related
    parameter, also found in the NETBEUI section, is NAMECACHE, which
    specifies the size of the remote name directory and defaults to 1000
    entries.
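    A sketch of the relevant NETBEUI entries follows.  The section name and
    the DATAGRAMPACKETS value shown are illustrative; keep your existing
    values for all other parameters in the section:

```ini
[NETBEUI_nif]
   RNDOPTION = 1            ; 1 = enable the Remote Name Directory
   DATAGRAMPACKETS = 8      ; must be greater than 2 when RND is enabled
   NAMECACHE = 1000         ; remote name directory size (default 1000)
```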
 
  DCDB Replication Performance
 
    Changes to the DCDB Replicator service for LS 4.0 have yielded substantial
    performance improvements.  In some configurations, users may see up to an
    80 percent increase in performance over the LS 3.0 DCDB Replicator service.
 
  Upgrading from LAN Server 3.0
 
    Upgrading from LAN Server 3.0 to LAN Server 4.0 will cause parameters in
    the PROTOCOL.INI file to be set to the LS 4.0 default values.  This may
    cause performance problems in previously tuned servers.  Users who have
    fine-tuned their PROTOCOL.INI for LS 3.0 should be aware that they may
    need to make the same changes for LS 4.0.
 
  Considerations when RAW SMBs are disabled
 
    The multiplex read and write SMB protocols are used if the RAW SMB protocol
    is disabled.  These protocols divide data transfers into buffer-size chunks
    (sizworkbuf) and chain them together to satisfy large read or write
    requests.
 
    A parameter that affects performance when working in multiplex mode is
    PIGGYBACKACK in the NETBEUI section of the PROTOCOL.INI file.  This
    parameter specifies whether NetBIOS sends and receives acknowledgements
    piggybacked with incoming data.  When used with RAW SMBs, piggybacked
    acknowledgements improve performance.  However, users who attempt to use
    them with multiplex SMBs may see performance degrade by up to 3 times for
    large file transfers.
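    If you run with RAW SMBs disabled, you may therefore want to turn
    piggybacked acknowledgements off.  A sketch follows; the section name is
    illustrative, and we assume a value of 0 disables the option:

```ini
[NETBEUI_nif]
   PIGGYBACKACK = 0         ; consider disabling when RAW SMBs are off
```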
 
    Note:  The RAW SMB protocol is disabled on a server when srvheuristic 19 in
    the IBMLAN.INI file is set to 0 (default=1).  The RAW SMB protocol on an
    OS/2 client is disabled when IBMLAN.INI wrkheuristic 11 is set to 0
    (default=1) and wrkheuristics 14 and 15 are set to 1 (default=1).
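    The heuristics appear in IBMLAN.INI as a single string of digits, numbered
    starting from position 0.  The sketch below shows disabling RAW SMBs on
    the server; each x stands for an existing digit that you should leave
    unchanged, and only position 19 (the twentieth digit) is set:

```ini
[server]
   ; x = keep your existing digit; setting position 19 to 0
   ; disables the RAW SMB protocol on the server
   srvheuristics = xxxxxxxxxxxxxxxxxxx0
```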
 
  DOS TCP/IP
 
    The LAN Server Performance Team has tested a number of vendor TCP/IP
    products for DOS.  These include Network Telesystems, Wollongong, and FTP
    TCP/IP offerings.  In many cases, these performed considerably better than
    the IBM TCP/IP protocol stack shipped with LAN Server 4.0.  The Network
    Telesystems product, in particular, showed significant throughput
    improvement.
 
    While IBM continues to refine its DOS TCP/IP offering, the performance of
    the OEM products reviewed may make them a near-term solution for running
    DOS clients in a TCP/IP environment.
 
    In addition to the TCP/IP protocol stack, each of the vendor products
    includes the normal TCP/IP applications such as FTP, mail, SNMP, etc.
 
  Configuring DOS LAN Services with Windows for Workgroups
 
    You can install both Windows for Workgroups and DOS LAN Services on the
    same workstation.  However, you cannot use the network function of Windows
    for Workgroups with this configuration.  To run DOS LAN Services and
    Windows for Workgroups on the same workstation, use the following
    procedure:
 
           1.  Install Windows for Workgroups
 
           2.  Install DOS LAN Services
 
           3.  In the WINDOWS\SYSTEM directory, rename the following files:
                 From:               To:
                 VNETSUP.386         VNETSUP.WFW
                 VREDIR.386          VREDIR.WFW
                 NETAPI.DLL          NETAPI.WFW
                 PMSPL.DLL           PMSPL.WFW
 
           4.  In the CONFIG.SYS file, REM out the following line:
                 'DEVICE=C:\WINDOWS\IFSHELP.SYS'
 
           5.  In the Windows SYSTEM.INI file, under the '386enh' section,
               change the line that contains the 'network=' statement to the
               following: 'network=vnetbios.386,vnetsup.386,vredir.386'
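    After steps 4 and 5, the edited lines would look like the following
    (paths assume a default C:\WINDOWS installation; the first line is the
    CONFIG.SYS entry, the remainder the SYSTEM.INI entries):

```ini
REM DEVICE=C:\WINDOWS\IFSHELP.SYS

[386enh]
network=vnetbios.386,vnetsup.386,vredir.386
```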
 
    The fix for APAR IC08963 makes the same changes, so you can use the APAR
    fix if you do not want to change the CONFIG.SYS and SYSTEM.INI files
    manually.
 
 
Additional Tips for LAN Server 4.0 Performance
 
  A number of the major factors affecting performance of LAN Server 4.0 are
  reviewed in the following sections. Although a few parameters are discussed,
  most of the tips are aimed at getting you to think about your particular
  environment in relation to LAN Server's system resources.
 
  Because there will always be a bottleneck in any computer system, the
  objective of performance tuning is to remove the current bottleneck.
  Ideally, the resulting system's new bottleneck lies at an operating point
  outside normal operating conditions.
 
  Entry vs. Advanced Server
 
    If your LAN Server is to share files, applications, or printers for fewer
    than 80 users, the Entry Server will fit your needs with very good
    performance.  The LANQuest report described in the Performance Benchmark
    Comparison section of this paper contains a comparison of Entry vs.
    Advanced Server performance.  A subsequent upgrade to Advanced Server is
    available with minimal impact to your business.
 
    If your immediate requirements are for high performance and high capacity,
    you will want the Advanced Server.  To gain the performance advantage of
    the Advanced version, your applications and data files must reside on an
    HPFS386 partition, not on a FAT partition.  Neither OS/2 2.1 nor LS 4.0
    itself needs to be installed on an HPFS386 partition, because accesses to
    system software are infrequent after initial loading.
 
  Fixed Disk Utilization
 
    The disk subsystem, an electromechanical device, can often be the system
    bottleneck even though the system provides a lot of memory for caching
    files.  If you have observed that your fixed disk activity indicator (the
    little light that flashes when the hard disk is in use) is on more than it
    is off for long periods of time, you probably have a disk bottleneck.
    Your options for improving performance include:
 
    -      Distribute the disk-intensive workload from a single physical disk
           drive to multiple disk drives, enabling concurrent disk seeks and
           read/writes.
 
    -      Off-load some users, files, or applications to another server.
 
    -      Install the Fault Tolerance feature of LAN Server to enable disk
           mirroring. This not only protects your data by backing up your disk
           but also improves performance since the additional disk drive will
           also be used to read data (split reads).
 
    -      Adding fixed disks and striping data across them (RAID
           architecture) will sometimes improve performance as well as enhance
           data integrity in an environment where data is predominantly looked
           up (read) without a subsequent update (write), for example,
           databases used for price lookup, part number information, etc.
 
  CPU Utilization
 
    Server performance can degrade when the CPU's ability to process its
    incoming workload is overtaxed.  If there are many users (usually
    hundreds) with high interaction rates to the server, a CPU bottleneck may
    occur (the Advanced server's CPU efficiency is several times greater than
    the Entry server's).  You may see a lot of fixed-disk activity and
    suspect the disk subsystem, but this may be lazy-write activity, which is
    not necessarily the system bottleneck.  To check CPU utilization, you can
    use System Performance Monitor/2 or LAN NetView* Monitor for a detailed
    analysis.
 
    To get a rough idea of how your server uses the CPU, start the Pulse
    applet from the OS/2 Workplace Shell* Productivity folder and observe its
    display during a heavy server workload period.  If the CPU utilization
    level is 80 percent or greater for much of the time, performance is being
    impacted by the CPU's ability to satisfy its workload demands.  Replacing
    standard network interface cards (NICs) with busmaster NICs will offload
    work from the server CPU and usually improve server performance.  Another
    remedy is to move some of the users, files, applications, or functions
    (e.g., domain controller or print server) to another server, or to
    upgrade to a more powerful hardware system.
 
  Network Interface Cards (NICs)
 
    Let's assume that your fixed disk activity is not excessive and that your
    CPU utilization is generally less than 30 to 40 percent, but you still
    feel that your server could respond more quickly. Your network interface
    card (NIC) is analogous to a nozzle which physically limits the amount of
    traffic flowing to/from the server.  Depending on the number of users,
    speed of the client machines, type of data transactions, etc., server
    performance can be NIC-limited. NICs come in 8-bit, 16-bit and 32-bit bus
    widths.  Some 32-bit NICs are busmasters, which means they can handle most
    data transfers with their built-in processors, relieving the server CPU of
    this task.
 
    You can improve a NIC-limited condition by changing to a faster NIC and/or
    adding NICs to your server.  As you add NICs, your server CPU utilization
    will increase because the server is busier servicing the additional
    traffic coming through the NICs (nozzles).  If you add busmaster NICs,
    the increase in server CPU utilization will be less significant, as you
    might expect.  LS 4.0 will automatically load-balance sessions across all
    NICs when you initiate a session.  When using standard 16/4 token-ring
    NICs, we recommend a 16 KB shared RAM size for best performance and
    memory utilization.
 
    Both LAN Server versions 3.0 and 4.0 now support more OEM NICs than the
    initial release of LAN Server 3.0.  You may obtain the current lists of
    supported NICs from CompuServe** with the following selections:
 
           1.    GO IBM
           2.    Technical Service and Support
           3.    IBM OS/2 Forums
           4.    OS/2 Developer 2 Forum (Browse)
           5.    LAN Server Library
 
  Network Media Utilization
 
    The physical media over which network traffic flows has a finite capacity.
    The Ethernet bandwidth limit today is usually 10 megabits per sec (Mbps);
    token rings today are running at 4 Mbps or 16 Mbps.  It is quite possible
    that with powerful servers and hundreds of clients, LANs can almost
    saturate the physical media providing interconnection.  This is much more
    likely to occur in Ethernet networks due to the collision-detection and
    retransmission (CSMA/CD) nature of that architecture.
 
    In large networks interconnecting many clients and servers, the level of
    network traffic on the wire can impact token-ring network performance.
    A simple (but not always viable) remedy is to change your network
    topology: add NICs to your server, and separate and isolate clients into
    LAN segments so that all network traffic does not pass through all
    machines.  The net effect is that a server with two Ethernet NICs now has
    a greater potential bandwidth (20 Mbps) plus a lower collision level on
    each of the two segments than on a single Ethernet segment.
 
    This solution is not viable if the machines on the two isolated segments
    must communicate, since LS 4.0 does not internally route the NetBIOS
    protocol.  More sophisticated ways to reduce network utilization include
    using the traditional backbone rings and bridges plus the new intelligent
    switches, hubs, and routers now becoming available.
 
 
Performance Benchmark Comparison
 
  In October 1994 LANQuest Labs published a Performance Benchmark
  Comparison Report assessing the performance of LAN Server 4.0 Advanced and
  Entry, Windows NT** Server 3.5, and NetWare** 4.02.  The results of this
  benchmarking showed that LAN Server 4.0 Advanced was 38% faster than
  Windows NT Server and 11% faster than NetWare.  For copies of this report,
  call 1-800-IBM-4FAX and request document 2014.
 
 
  Trademarks denoted by an asterisk (*) are IBM trademarks or registered
  trademarks of the IBM Corporation in the United States and/or other
  countries:  IBM, OS/2, DB2, NetView, Workplace Shell
 
  Trademarks denoted by a double asterisk (**) are registered trademarks of
  their respective companies.