tcp(4P)                        Network Protocols                       tcp(4P)



NAME
       tcp, TCP - Internet Transmission Control Protocol

SYNOPSIS
       #include <sys/socket.h>


       #include <netinet/in.h>


       s = socket(AF_INET, SOCK_STREAM, 0);


       s = socket(AF_INET6, SOCK_STREAM, 0);


       t = t_open("/dev/tcp", O_RDWR);


       t = t_open("/dev/tcp6", O_RDWR);

DESCRIPTION
       TCP is the virtual circuit protocol of the Internet protocol family. It
       provides reliable, flow-controlled, in order, two-way  transmission  of
       data.  It is a byte-stream protocol layered above the Internet Protocol
       (IP), or the Internet Protocol Version 6 (IPv6), the Internet  protocol
       family's internetwork datagram delivery protocol.


       Programs  can  access  TCP  using the socket interface as a SOCK_STREAM
       socket type, or using the Transport Level Interface (TLI) where it sup‐
       ports the connection-oriented (T_COTS_ORD) service type.


       TCP  uses  IP's host-level addressing and adds its own per-host collec‐
       tion of "port addresses." The endpoints of a TCP connection are identi‐
       fied by the combination of an IP or IPv6 address and a TCP port number.
       Although other protocols, such as the User Datagram Protocol (UDP), can
       use the same host and port address format, the port space of these pro‐
       tocols is distinct. See inet(4P) and inet6(4P) for details on the  com‐
       mon aspects of addressing in the Internet protocol family.


       Sockets  utilizing TCP are either "active" or "passive." Active sockets
       initiate connections to passive sockets. Both  types  of  sockets  must
       have  their local IP or IPv6 address and TCP port number bound with the
       bind(3C) system call after the socket is created. By default, TCP sock‐
       ets  are  active. A passive socket is created by calling the listen(3C)
       system call after binding the socket with bind().  This  establishes  a
       queueing  parameter  for the passive socket. After this, connections to
       the passive socket can be received with  the  accept(3C)  system  call.
       Active  sockets use the connect(3C) call after binding to initiate con‐
       nections.
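

       The passive and active roles described above can be sketched in C as
       follows. This is a minimal illustration, not a complete server or
       client: the helper names (tcp_listen_loopback, tcp_local_port,
       tcp_connect_loopback) are hypothetical, only IPv4 on the loopback
       address is handled, and error reporting is reduced to a -1 return.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Passive socket: bind to 127.0.0.1:port, then listen().
 * Port 0 asks the system to choose a free port. Returns the fd or -1. */
int tcp_listen_loopback(in_port_t port, int backlog)
{
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(s, backlog) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Port a bound socket ended up with, in host byte order; 0 on error. */
in_port_t tcp_local_port(int s)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);

    if (getsockname(s, (struct sockaddr *)&sin, &len) < 0)
        return 0;
    return ntohs(sin.sin_port);
}

/* Active socket: bind with INADDR_ANY and port 0 (address and port left
 * to the system), then connect() to 127.0.0.1:port. Returns the fd or -1. */
int tcp_connect_loopback(in_port_t port)
{
    struct sockaddr_in local, peer;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    peer.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        connect(s, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

       A caller would typically create the listener first, pass the port
       from tcp_local_port() to tcp_connect_loopback(), and then call
       accept() on the listener to obtain the connected socket.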


       By using the special value INADDR_ANY  with   IP,  or  the  unspecified
       address  (all  zeroes)  with  IPv6,  the  local  IP address can be left
       unspecified in the bind() call by either active or passive TCP sockets.
       This  feature is usually used if the local address is either unknown or
       irrelevant. If left unspecified, the local IP or IPv6 address is  bound
       at connection time to the address of the network interface used to ser‐
       vice the connection.


       No two TCP sockets can be bound to the same port unless the bound IP
       addresses are different. This behavior can be changed by using the
       SO_REUSEPORT option. If both the binding socket and the existing bound
       socket have this option enabled, and the user IDs of both sockets (at
       the time bind() is called) are the same, the bind() is allowed.
       However, only one of the two sockets can become a listener socket.


       When comparing addresses at bind() time, the IPv4 INADDR_ANY address
       and the IPv6 unspecified address compare as equal to any IPv4 or IPv6
       address. For example, if a socket is bound to INADDR_ANY or the
       unspecified address and port X, no other socket can bind to port X,
       regardless of the binding address. This special treatment of
       INADDR_ANY and the unspecified address can be changed using the socket
       option SO_REUSEADDR. If SO_REUSEADDR is set on a socket doing a bind,
       INADDR_ANY and the unspecified address do not compare as equal to any
       IP address. This means that as long as the two sockets are not both
       bound to INADDR_ANY or the unspecified address, or to the same IP
       address, the two sockets can be bound to the same port.


       If an application does not want to allow another socket using the
       SO_REUSEADDR/SO_REUSEPORT option to bind to a port its socket is bound
       to, the application can set the socket level option SO_EXCLBIND on a
       socket. The option values of 0 and 1 disable and enable the option
       respectively. Once this option is enabled on a socket, no other
       socket can be bound to the same port.
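

       The binding options above are ordinary SOL_SOCKET options that are
       set before bind(). The following sketch assumes the usual convention
       that a non-zero value enables a boolean socket option. The helper
       names are hypothetical, and because SO_EXCLBIND is not available on
       every system, the sketch reports ENOPROTOOPT where the header does
       not define it.

```c
#include <errno.h>
#include <sys/socket.h>

/* Enable SO_REUSEADDR on s. Call this before bind(). */
int allow_addr_reuse(int s)
{
    int on = 1;

    return setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}

/* Ask for exclusive binding so that no other socket can share the port.
 * Falls back to an ENOPROTOOPT error where SO_EXCLBIND is not defined. */
int require_exclusive_bind(int s)
{
#ifdef SO_EXCLBIND
    int on = 1;

    return setsockopt(s, SOL_SOCKET, SO_EXCLBIND, &on, sizeof(on));
#else
    (void)s;
    errno = ENOPROTOOPT;
    return -1;
#endif
}

/* Return 1 if the boolean SOL_SOCKET option opt is set, 0 if it is
 * clear, and -1 on error. */
int socket_option_is_set(int s, int opt)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(s, SOL_SOCKET, opt, &value, &len) < 0)
        return -1;
    return value != 0;
}
```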


       Once a connection has been established, data can be exchanged using the
       read(2) and write(2) system calls.


       Under  most  circumstances,  TCP  sends data when it is presented. When
       outstanding data has not yet  been  acknowledged,   TCP  gathers  small
       amounts of output to be sent in a single packet once an acknowledgement
       has been received. For a small number of clients, such as  window  sys‐
       tems  that send a stream of mouse events which receive no replies, this
       packetization can cause significant delays. To circumvent this problem,
       TCP provides a socket-level boolean option, TCP_NODELAY. TCP_NODELAY is
       defined in <netinet/tcp.h>, and is set with setsockopt(3C)  and  tested
       with  getsockopt(3C). The option level for the setsockopt() call is the
       protocol number for TCP, available from getprotobyname(3C).
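

       Setting and testing TCP_NODELAY might look like the sketch below. It
       uses the constant IPPROTO_TCP from <netinet/in.h> as the option
       level, which is the same protocol number that getprotobyname("tcp")
       returns. The helper names are hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn TCP_NODELAY on (non-zero) or off (zero) for socket s. */
int set_nodelay(int s, int on)
{
    return setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Return 1 if TCP_NODELAY is set, 0 if it is clear, -1 on error. */
int get_nodelay(int s)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;
    return value != 0;
}
```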


       For some applications, it can be desirable for TCP not to send out data
       unless  a  full  TCP  segment  can be sent. To enable this behavior, an
       application can use the TCP_CORK socket option. When  TCP_CORK  is  set
       with  a  non-zero  value,  TCP  sends out a full TCP segment only. When
       TCP_CORK is set to zero after it has been enabled, all buffered data is
       sent  out  (as  permitted  by the peer's receive window and the current
       congestion window). TCP_CORK is defined in <netinet/tcp.h>, and is  set
       with  setsockopt(3C)  and  tested with getsockopt(3C). The option level
       for the setsockopt() call is the protocol  number  for  TCP,  available
       from getprotobyname(3C).
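

       A common use of TCP_CORK is to bracket several small writes so that
       they leave the host in as few segments as possible. The sketch below
       shows that pattern. The function names are hypothetical, interrupted
       writes (EINTR) are treated as errors for brevity, and the socket is
       assumed to be connected already.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Turn TCP_CORK on (non-zero) or off (zero). */
int set_cork(int s, int on)
{
    return setsockopt(s, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Write all len bytes of buf, continuing after short writes. */
static int write_all(int s, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(s, buf, len);

        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send a header and a body under a cork, then release the cork so that
 * any buffered data is sent. Returns 0 on success, -1 on error. */
int send_corked(int s, const char *head, size_t hlen,
                const char *body, size_t blen)
{
    int rc;

    if (set_cork(s, 1) < 0)
        return -1;
    rc = write_all(s, head, hlen);
    if (rc == 0)
        rc = write_all(s, body, blen);
    /* Always uncork, even after a failed write. */
    if (set_cork(s, 0) < 0)
        rc = -1;
    return rc;
}
```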


       The  TCP_MAXSEG  socket option can be used to determine the TCP maximum
       segment size (MSS) that will be used by a socket. When set after a con‐
       nection has been completed, the TCP_MAXSEG socket option will limit the
       sizes of the segments being sent. Only reductions in the  segment  size
       are allowed; this should never increase the size of segments.


       When  set before a connection is begun (by either the listen(3C) or the
       connect(3C) call) the option will additionally specify the size sent in
       the  outgoing TCP  MSS option on the SYN segment; this can be useful in
       dealing with networks that impose unusual restrictions on packet  size.
       If  the  user-specified  value is larger than the value that would have
       been sent otherwise, the smaller value is used.


       When retrieved before a connection is completed, the TCP_MAXSEG  socket
       option  will  return  the default segment size that will be used if the
       peer does not send an MSS option (536 for IPv4, 1220  for  IPv6).  When
       retrieved  after the connection is completed, the value returned is the
       current maximum segment size used by the  stack.  This  may  vary  over
       time,  due to Path MTU Discovery, but will never exceed any user-speci‐
       fied TCP_MAXSEG value.


       TCP_MAXSEG is defined in <netinet/tcp.h>,  and  is  set  with  setsock‐
       opt(3C)  and  retrieved  with  getsockopt(3C). The option level for the
       setsockopt() call is the protocol number for TCP, available  from  get‐
       protobyname(3C). The option value is an int.
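

       Retrieving and lowering the maximum segment size could be sketched
       as follows. The helper names are hypothetical, and the range of
       values accepted by the set operation is system dependent, so callers
       should check the return value rather than assume a particular size
       is accepted.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Return the current TCP_MAXSEG value for s, or -1 on error. Before a
 * connection is completed this is the default segment size. */
int get_max_segment(int s)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0)
        return -1;
    return mss;
}

/* Ask TCP to use segments no larger than mss bytes. Set this before
 * listen() or connect() if the MSS option on the SYN should reflect it. */
int limit_max_segment(int s, int mss)
{
    return setsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));
}
```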


       The TCP_MD5SIG socket option can be set with setsockopt(3C) to sign a
       TCP segment using an MD5 hash as described in RFC 2385. Setting this
       option has an adverse effect on performance. This option is turned off
       by default. Setting the TCP_MD5SIG option when no security
       associations (SAs) have been configured is not supported. Currently, a
       connection request proceeds without the MD5 option, but an incoming
       connection is not accepted. The TCP_MD5SIG option is applied only if
       it is set before initiating or accepting a connection.


       Another socket level option, SO_RCVBUF, can be used to control the win‐
       dow  that TCP advertises to the peer. IP level options can also be used
       with TCP. See ip(4P) and ip6(4P).


       TCP provides an urgent data mechanism, which can be invoked  using  the
       out-of-band  provisions  of  send(3C).  The caller can mark one byte as
       "urgent" with the MSG_OOB  flag  to  send(3C).  This  sets  an  "urgent
       pointer"  pointing  to this byte in the TCP stream. The receiver on the
       other side of the stream is notified of the urgent  data  by  a  SIGURG
       signal.  The  SIOCATMARK   ioctl(2)  request returns a value indicating
       whether the stream is at the urgent  mark.  Because  the  system  never
       returns  data  across  the  urgent mark in a single read(2) call, it is
       possible to advance to the urgent data in a  simple  loop  which  reads
       data, testing the socket with the SIOCATMARK  ioctl() request, until it
       reaches the mark.


       Incoming connection requests that include an IP source route option are
       noted, and the reverse source route is used in responding.


       A  checksum over all data helps TCP implement reliability. Using a win‐
       dow-based flow control mechanism that makes use  of  positive  acknowl‐
       edgements,  sequence  numbers,  and  a retransmission strategy, TCP can
       usually recover when datagrams  are  damaged,  delayed,  duplicated  or
       delivered out of order by the underlying communication medium.


       If the local TCP receives no acknowledgements from its peer for a
       period of time (for example, if the remote machine crashes), the
       connection is closed and an error is returned.


       The TCP level socket options TCP_CONN_ABORT_THRESHOLD and
       TCP_ABORT_THRESHOLD can be used to change and retrieve this period of
       time. The option value is a uint32_t in milliseconds.
       TCP_CONN_ABORT_THRESHOLD controls the period before a connection is
       established, and TCP_ABORT_THRESHOLD controls the period after a
       connection is established. If the application does not want TCP to
       time out, it can use the option value 0.


       During this period, TCP retransmits the unacknowledged data multiple
       times, each time after a timeout. The timeout interval is backed off
       exponentially between retransmissions. The TCP level socket options
       TCP_RTO_INITIAL, TCP_RTO_MIN, and TCP_RTO_MAX can be used to control
       the timeout interval. TCP_RTO_INITIAL controls the initial
       retransmission timeout period. TCP_RTO_MIN and TCP_RTO_MAX control the
       minimum and maximum timeout period respectively. The option value is a
       uint32_t in milliseconds.
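

       To see how these options interact, the following sketch counts the
       retransmissions that fit inside an abort threshold under a
       deliberately simplified model: the first timeout equals the initial
       value, each later timeout doubles, and every timeout is clamped to
       the minimum and maximum. The real stack also adapts the timeout to
       measured round-trip times, so this is an illustration of the
       arithmetic only. The function name and the sample values in the
       comment are hypothetical and are not the system defaults.

```c
#include <stdint.h>

/* Count retransmissions before abort_ms elapses, assuming pure doubling
 * from rto_initial_ms clamped to [rto_min_ms, rto_max_ms]. Returns 0
 * when abort_ms is 0 (never time out), since no finite count exists. */
unsigned retransmissions_before_abort(uint32_t rto_initial_ms,
                                      uint32_t rto_min_ms,
                                      uint32_t rto_max_ms,
                                      uint32_t abort_ms)
{
    uint64_t elapsed = 0;
    uint64_t rto = rto_initial_ms;
    unsigned count = 0;

    if (abort_ms == 0)
        return 0;
    if (rto < rto_min_ms)
        rto = rto_min_ms;
    if (rto > rto_max_ms)
        rto = rto_max_ms;
    if (rto == 0)
        return 0;                 /* degenerate input; avoid looping */

    /* Each pass is one timer expiry followed by one retransmission. */
    while (elapsed + rto < abort_ms) {
        elapsed += rto;
        count++;
        rto *= 2;                 /* exponential backoff */
        if (rto > rto_max_ms)
            rto = rto_max_ms;
    }
    return count;
}

/* Example: initial 1000 ms, min 400 ms, max 60000 ms, abort 8000 ms.
 * Timers expire at 1000, 3000 and 7000 ms; the next expiry would be at
 * 15000 ms, past the threshold, so three retransmissions are counted. */
```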


       The  default  values  of  the  above options, TCP_CONN_ABORT_THRESHOLD,
       TCP_ABORT_THRESHOLD, TCP_RTO_MIN, TCP_RTO_MAX, and TCP_RTO_INITIAL  are
       appropriate for most situations. An application should only alter their
       values in special circumstances and when it has detailed  knowledge  of
       the network environment.


       TCP follows the congestion control algorithm described in RFC 2581, and
       also supports the initial congestion window (cwnd) changes in RFC 3390.
       The initial cwnd calculation can be overridden by the socket option
       TCP_INIT_CWND. An application can use this option to set the initial
       cwnd to a specified number of TCP segments. This applies both when the
       connection first starts and when it restarts after an idle period. The
       process must have the PRIV_SYS_NET_CONFIG privilege if it wants to
       specify a number greater than that calculated by RFC 3390.


       The TCP_INFO option can be used to collect various information about
       the current state of a TCP socket, such as connection state, window
       sizes, and so forth. The data structure used as an argument is struct
       tcp_info.


       The TCP_CONGESTION option can be used to get or set a socket's  conges‐
       tion  control algorithm. Its argument is a pointer to a null-terminated
       string.
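

       Reading and changing the algorithm name could be sketched as
       follows. The helper names are hypothetical, and passing the length
       including the terminating null byte on the set operation is an
       assumption of this sketch. The set of valid algorithm names depends
       on the system, so no particular name is assumed here.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Copy the socket's congestion control algorithm name into name, which
 * has room for cap bytes. Returns 0 on success, -1 on error. */
int get_congestion_algorithm(int s, char *name, socklen_t cap)
{
    socklen_t len = cap;

    if (cap == 0)
        return -1;
    if (getsockopt(s, IPPROTO_TCP, TCP_CONGESTION, name, &len) < 0)
        return -1;
    name[cap - 1] = '\0';        /* guarantee termination if truncated */
    return 0;
}

/* Select the congestion control algorithm called name for socket s. */
int set_congestion_algorithm(int s, const char *name)
{
    return setsockopt(s, IPPROTO_TCP, TCP_CONGESTION, name,
                      (socklen_t)(strlen(name) + 1));
}
```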


       Oracle Solaris OS supports TCP Extensions  for  High  Performance  (RFC
       1323)  which includes the window scale and time stamp options, and Pro‐
       tection Against Wrap Around Sequence Numbers (PAWS). Oracle Solaris  OS
       also  supports  Selective Acknowledgment (SACK) capabilities (RFC 2018)
       and Explicit Congestion Notification (ECN) mechanism (RFC 3168).


       Turn on the window scale option in one of the following ways:

           o      An application can set SO_SNDBUF or SO_RCVBUF  size  in  the
                  setsockopt() option to be larger than 64K. This must be done
                  before the program calls listen() or connect(), because  the
                  window  scale  option  is  negotiated when the connection is
                  established. Once the connection has been made,  it  is  too
                  late  to  increase  the  send  or  receive window beyond the
                  default TCP limit of 64K.


           o      For all applications, use ndd(8) to modify the configuration
                  parameter  tcp_wscale_always. If tcp_wscale_always is set to
                  1, the window scale option is always set when connecting  to
                  a remote system. If tcp_wscale_always is 0, the window scale
                  option is set only if the  user  has  requested  a  send  or
                  receive  window  larger  than  64K.  The  default  value  of
                  tcp_wscale_always is 1.


           o      Regardless of the value  of  tcp_wscale_always,  the  window
                  scale option is always included in a connect acknowledgement
                  if the connecting system has used the option.
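

       The first method above, requesting large buffers before listen() or
       connect(), can be sketched as follows. The helper names and the
       256K figure in the usage note are hypothetical. The system may
       adjust the requested size, so the sketch also shows how to read the
       value back.

```c
#include <sys/socket.h>

/* Request send and receive buffers of the given size in bytes. Call
 * this before listen() or connect(), because the window scale option is
 * negotiated when the connection is established. */
int request_large_windows(int s, int bytes)
{
    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    return 0;
}

/* Return the receive buffer size currently in effect, or -1 on error. */
int get_receive_buffer(int s)
{
    int bytes = 0;
    socklen_t len = sizeof(bytes);

    if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, &bytes, &len) < 0)
        return -1;
    return bytes;
}
```

       For example, request_large_windows(s, 256 * 1024) asks for buffers
       larger than 64K on a socket that has not yet been connected.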



       Turn on SACK capabilities in the following way:

           o      Use ndd to modify the configuration parameter  tcp_sack_per‐
                  mitted.  If  tcp_sack_permitted  is  set  to 0, TCP does not
                  accept SACK or send out SACK information.  If  tcp_sack_per‐
                  mitted is set to 1,  TCP does not initiate a connection with
                  SACK permitted option in the SYN segment, but  does  respond
                  with  SACK  permitted  option  in  the SYN|ACK segment if an
                  incoming connection request has the SACK  permitted  option.
                  This  means  that  TCP  only accepts SACK information if the
                  other side of the connection also accepts SACK  information.
                  If  tcp_sack_permitted  is  set  to 2, it both initiates and
                  accepts connections with SACK information. The  default  for
                  tcp_sack_permitted is 2 (active enabled).



       Turn on TCP ECN mechanism in the following way:

           o      Use  ndd  to modify the configuration parameter tcp_ecn_per‐
                  mitted. If tcp_ecn_permitted is set to 0, TCP does not nego‐
                  tiate   with   a   peer  that  supports  ECN  mechanism.  If
                  tcp_ecn_permitted is set to 1 when initiating a  connection,
                  TCP  does  not  tell  a peer that it supports ECN mechanism.
                  However, it tells a peer that it supports ECN mechanism when
                  accepting  a  new  incoming  connection  request if the peer
                  indicates that it supports ECN mechanism in the SYN segment.
                  If tcp_ecn_permitted is set to 2, in addition to negotiating
                  with a peer on ECN mechanism when accepting connections, TCP
                  indicates  in  the outgoing SYN segment that it supports ECN
                  mechanism when TCP makes active  outgoing  connections.  The
                  default for tcp_ecn_permitted is 1.



       Turn on the time stamp option in the following way:

           o      Use ndd to modify the configuration parameter
                  tcp_tstamp_always. If tcp_tstamp_always is 1, the time stamp
                  option is always set when connecting to a remote system.
                  If tcp_tstamp_always is 0, the time stamp option is not
                  set when connecting to a remote system. The default for
                  tcp_tstamp_always is 0.


           o      Regardless of the value of tcp_tstamp_always, the time stamp
                  option  is always included in a connect acknowledgement (and
                  all succeeding packets) if the connecting  system  has  used
                  the time stamp option.



       Use  the following procedure to turn on the time stamp option only when
       the window scale option is in effect:

           o      Use   ndd   to   modify    the    configuration    parameter
                  tcp_tstamp_if_wscale.   Setting  tcp_tstamp_if_wscale  to  1
                  causes the time stamp option to be set when connecting to  a
                  remote  system,  if the window scale option has been set. If
                  tcp_tstamp_if_wscale is 0, the time stamp option is not  set
                  when   connecting  to  a  remote  system.  The  default  for
                  tcp_tstamp_if_wscale is 1.



       Protection Against Wrap Around Sequence Numbers (PAWS) is  always  used
       when the time stamp option is set.


       Oracle  Solaris OS also supports multiple methods of generating initial
       sequence numbers. One of these methods is the improved  technique  sug‐
       gested  in  RFC  1948. We HIGHLY recommend that you set sequence number
       generation parameters as close to boot time as possible. This  prevents
       sequence number problems on connections that use the same connection-ID
       as ones that used a different sequence number generation. The svc:/net‐
       work/initial:default  service  configures  the  initial sequence number
       generation. The service reads the value contained in the  configuration
       file /etc/default/inetinit to determine which method to use.


       The /etc/default/inetinit file is an unstable interface, and can change
       in future releases.


       TCP can be configured to report some information  on  connections  that
       terminate by means of an RST packet. By default, no logging is done. If
       the ndd(8) parameter tcp_trace is set to 1, then  trace  data  is  col‐
       lected for all new connections established after that time.


       The  trace  data consists of the TCP headers and IP source and destina‐
       tion addresses of the last few packets sent in  each  direction  before
       RST occurred. Those packets are logged in a series of strlog(9F) calls.
       This trace facility has a very low overhead, and so is superior to such
       utilities  as snoop(8) for non-intrusive debugging for connections ter‐
       minating by means of an RST.


       Oracle Solaris OS supports the keep-alive mechanism described in RFC
       1122. It is enabled using the socket option SO_KEEPALIVE. When
       enabled, there are two keep-alive mechanisms.

           o      By default, the first keep-alive probe is sent out after a
                  TCP connection has been idle for two hours. If the peer
                  does not respond to the probe within eight minutes, the TCP
                  connection is aborted. You can alter the interval for
                  sending out the first probe using the socket option
                  TCP_KEEPALIVE_THRESHOLD in milliseconds or TCP_KEEPIDLE in
                  seconds.

                  The system default is controlled by the TCP ndd parameter
                  tcp_keepalive_interval. The minimum value is ten seconds.
                  The maximum is ten days, while the default is two hours.
                  Use the TCP_KEEPALIVE_ABORT_THRESHOLD socket option to
                  change how long TCP waits for a response to the probe
                  before aborting the connection.

                  The option value is an unsigned integer in milliseconds. The
                  value zero indicates that TCP  should  never  time  out  and
                  abort  the  connection  when  probing. The system default is
                  controlled     by      the      TCP       ndd      parameter
                  tcp_keepalive_abort_interval. The default is eight minutes.


           o      The second mechanism is activated if the socket option
                  TCP_KEEPINTVL or TCP_KEEPCNT, or both, is set. The time
                  between consecutive probes is set by TCP_KEEPINTVL in
                  seconds. The minimum value is ten seconds. The maximum is
                  ten days, while the default is two hours. The TCP connection
                  is aborted after the number of probes set by TCP_KEEPCNT
                  has been sent without receiving a response.
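

       Enabling keep-alive with the second mechanism could be sketched as
       follows. The helper names and the values in the usage note are
       hypothetical. TCP_KEEPIDLE and TCP_KEEPINTVL are given in seconds,
       as described above.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keep-alive on s: first probe after idle_secs of idle time,
 * then one probe every interval_secs, aborting after probe_count
 * unanswered probes. Returns 0 on success, -1 on error. */
int enable_keepalive(int s, int idle_secs, int interval_secs,
                     int probe_count)
{
    int on = 1;

    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT,
                   &probe_count, sizeof(probe_count)) < 0)
        return -1;
    return 0;
}

/* Read back an int-valued TCP level option; returns -1 on error. */
int get_tcp_int_option(int s, int opt)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(s, IPPROTO_TCP, opt, &value, &len) < 0)
        return -1;
    return value;
}
```

       With enable_keepalive(s, 600, 30, 4), an unresponsive peer would be
       detected roughly 600 + 4 * 30 = 720 seconds after the connection
       goes idle, assuming every probe goes unanswered.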



       After an application closes a TCP connection, TCP enters the shutdown
       sequence. If the peer does not respond (for example, because it has
       crashed), the connection is stuck in the FIN-WAIT-2 state. To prevent
       this, Oracle Solaris OS starts a timer when TCP enters this state. If
       the timer fires and the shutdown sequence has not completed, the
       connection is freed. The socket option TCP_LINGER2 can be used to
       change and retrieve this timeout period. The option value is an int in
       seconds. The option value cannot be set higher than the system default
       value, which is controlled by the TCP private parameter
       tcp_fin_wait_2_flush_interval. The default value is appropriate for
       most situations. An application should only change the value in
       special circumstances and when it has detailed knowledge of the
       network environment.
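

       Changing and retrieving the FIN-WAIT-2 timeout could be sketched as
       follows. The helper names are hypothetical, and the value a caller
       passes must not exceed the system default described above.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Set the FIN-WAIT-2 timeout for s, in seconds. */
int set_fin_wait_2_timeout(int s, int seconds)
{
    return setsockopt(s, IPPROTO_TCP, TCP_LINGER2,
                      &seconds, sizeof(seconds));
}

/* Return the FIN-WAIT-2 timeout for s in seconds, or -1 on error. */
int get_fin_wait_2_timeout(int s)
{
    int seconds = 0;
    socklen_t len = sizeof(seconds);

    if (getsockopt(s, IPPROTO_TCP, TCP_LINGER2, &seconds, &len) < 0)
        return -1;
    return seconds;
}
```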

SEE ALSO
       svcs(1),  ioctl(2),  read(2),  write(2),  accept(3C),  bind(3C),   con‐
       nect(3C),  getprotobyname(3C),  getsockopt(3C),  listen(3C),  send(3C),
       inet(4P), inet6(4P), ip(4P), ip6(4P), smf(7), ndd(8), svcadm(8)


       Ramakrishnan, K., Floyd, S., Black,  D.,  RFC  3168,  The  Addition  of
       Explicit Congestion Notification (ECN) to IP, September 2001.


       Mathis,  M. and Mahdavi, J. Pittsburgh Supercomputing Center; Floyd, S.
       Lawrence Berkeley National Laboratory; Romanow,  A.  Sun  Microsystems,
       Inc. RFC 2018, TCP Selective Acknowledgment Options, October 1996.


       Bellovin,  S., RFC 1948, Defending Against Sequence Number Attacks, May
       1996.


       Jacobson, V., Braden, R., and Borman, D., RFC 1323, TCP Extensions  for
       High Performance, May 1992.


       Postel,  Jon,  RFC  793, Transmission Control Protocol - DARPA Internet
       Program Protocol Specification, Network Information Center, SRI  Inter‐
       national, Menlo Park, CA., September 1981.

DIAGNOSTICS
       A socket operation can fail if:

       EISCONN          A  connect()  operation  was  attempted on a socket on
                        which a connect()  operation  had  already  been  per‐
                        formed.


       ETIMEDOUT        A  connection was dropped due to excessive retransmis‐
                        sions.


       ECONNRESET       The remote peer forced the  connection  to  be  closed
                        (usually  because  the  remote  machine has lost state
                        information about the connection due to a crash).


       ECONNREFUSED     The remote peer actively refused connection establish‐
                        ment  (usually  because no process is listening to the
                        port).


       EADDRINUSE       A bind() operation was attempted on a  socket  with  a
                        network  address/port pair that has already been bound
                        to another socket.


       EADDRNOTAVAIL    A bind() operation was attempted on a  socket  with  a
                        network address for which no network interface exists.


       EACCES           A  bind()  operation  was  attempted with a "reserved"
                        port number and the effective user ID of  the  process
                        was not the privileged user.


       ENOBUFS          The  system ran out of memory for internal data struc‐
                        tures.
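

       Applications usually tell these conditions apart by inspecting
       errno immediately after the failing call. The sketch below probes a
       loopback port and returns the errno value, so that a caller can
       compare it with ECONNREFUSED, ETIMEDOUT, and so on. The function
       name is hypothetical, and only IPv4 on the loopback address is
       handled.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to connect to 127.0.0.1:port and close the socket again.
 * Returns 0 if the connection succeeded, otherwise the errno value of
 * the call that failed (for example ECONNREFUSED). */
int probe_loopback_port(in_port_t port)
{
    struct sockaddr_in sin;
    int err = 0;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return errno;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(port);
    if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        err = errno;             /* save before close() can change it */
    close(s);
    return err;
}
```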


NOTES
       The tcp service is managed by the service management facility,  smf(7),
       under the service identifier:

         svc:/network/initial:default



       Administrative actions on this service, such as enabling, disabling, or
       requesting restart, can be performed  using  svcadm(8).  The  service's
       status can be queried using the svcs(1) command.



Oracle Solaris 11.4               06 May 2016                          tcp(4P)