summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2011-11-29net: optimize socket timestampingEric Dumazet
We can test/set multiple bits from sk_flags at once, to shorten a bit socket setup/dismantle phase. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29net: dont call jump_label_dec from irq contextEric Dumazet
Igor Maravic reported an error caused by jump_label_dec() being called from IRQ context : BUG: sleeping function called from invalid context at kernel/mutex.c:271 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper 1 lock held by swapper/0: #0: (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340 Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1 Call Trace: <IRQ> [<ffffffff8104f417>] __might_sleep+0x137/0x1f0 [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370 [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff8109a37f>] ? local_clock+0x6f/0x80 [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0 [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90 [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80 [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50 [<ffffffff81566525>] net_disable_timestamp+0x15/0x20 [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50 [<ffffffff81557b00>] __sk_free+0x80/0x200 [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90 [<ffffffff81557cba>] sock_wfree+0x3a/0x70 [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120 [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30 [<ffffffff8155c119>] kfree_skb+0x49/0x170 [<ffffffff815e936e>] arp_error_report+0x3e/0x90 [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0 [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0 [<ffffffff81578d20>] ? neigh_update+0x640/0x640 [<ffffffff81073558>] __do_softirq+0xc8/0x3a0 Since jump_label_{inc|dec} must be called from process context only, we must defer jump_label_dec() if net_disable_timestamp() is called from interrupt context. Reported-by: Igor Maravic <igorm@etf.rs> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29NET: NETROM: Fix formatting.Ralf Baechle
The Linux coding style wants the return statement on its own line. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29NET: NETROM: Cleanup argument SIOCADDRT ioctl argument checking.Ralf Baechle
nr_route.ndigis is unsigned int so the nr_route.ndigis < 0 expression is never true and can be dropped. Doing the nr_ax25_dev_get call later allows the nr_route.ndigis test to bail out without having to dev_put. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Osterried <thomas@osterried.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29NET: NETROM: When adding a route verify length of mnemonic string.Ralf Baechle
struct nr_route_struct's mnemonic permits a string of up to 7 bytes to be used. If userland passes a not zero terminated string to the kernel adding a node to the routing table might result in the kernel attempting to read copy a too long string. Mnemonic is part of the NET/ROM routing protocol; NET/ROM routing table updates only broadcast 6 bytes. The 7th byte in the mnemonic array exists only as a \0 termination character for the kernel code's convenience. Fixed by rejecting mnemonic strings that have no terminating \0 in the first 7 characters. Do this test only NETROM_NODE to avoid breaking NETROM_NEIGH where userland might passing an uninitialized mnemonic field. Initial patch by Dan Carpenter <dan.carpenter@oracle.com>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Walter Harms <wharms@bfs.de> Cc: Thomas Osterried <thomas@osterried.de> Acked-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29NET: AX.25: Check ioctl arguments to avoid overflows further down the road.Ralf Baechle
Very large, nonsenical arguments or use in very extreme conditions could result in integer overflows. Check ioctls arguments to avoid such overflows and return -EINVAL for too large arguments. To allow the use of AX.25 for even the most extreme setup (think packet radio to the Phase 5E mars probe) we make no further attempt to clamp the argument range. Originally reported by Fan Long <longfancn@gmail.com> and a first patch was sent by Xi Wang <xi.wang@gmail.com>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Xi Wang <xi.wang@gmail.com> Cc: Joerg Reuter <jreuter@yaina.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Thomas Osterried <thomas@osterried.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29dsa: Move switch drivers to new directory drivers/net/dsaBen Hutchings
Support for specific hardware belongs under drivers/net/ not net/. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29dsa: Move all definitions needed by drivers into <net/dsa.h>Ben Hutchings
Any headers included by drivers should be under include/, and any definitions they use are not really private to the core as the name "dsa_priv.h" suggests. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29dsa: Remove unnecessary exportsBen Hutchings
I mistakenly exported functions from slave.c that are only called from dsa.c, part of the same module. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-28Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
2011-11-28sch_sfb: use skb_flow_dissect()Eric Dumazet
Current SFB double hashing is not fulfilling SFB theory, if two flows share same rxhash value. Using skb_flow_dissect() permits to really have better hash dispersion, and get tunnelling support as well. Double hashing point was mentioned by Florian Westphal Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-28cls_flow: use skb_flow_dissect()Eric Dumazet
Instead of using a custom flow dissector, use skb_flow_dissect() and benefit from tunnelling support. This lack of tunnelling support was mentioned by Dan Siemon. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-28net: use skb_flow_dissect() in __skb_get_rxhash()Eric Dumazet
No functional changes. This uses the code we factorized in skb_flow_dissect() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-28net: introduce skb_flow_dissect()Eric Dumazet
We use at least two flow dissectors in network stack, with known limitations and code duplication. Introduce skb_flow_dissect() to factorize this, highly inspired from existing dissector from __skb_get_rxhash() Note : We extensively use skb_header_pointer(), this permits us to not touch skb at all. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-28tcp: tcp_sendmsg() wrong access to sk_route_capsEric Dumazet
Now sk_route_caps is u64, its dangerous to use an integer to store result of an AND operator. It wont work if NETIF_F_SG is moved on the upper part of u64. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-28Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2011-11-27tcp: skip cwnd moderation in TCP_CA_Open in tcp_try_to_openNeal Cardwell
The problem: Senders were overriding cwnd values picked during an undo by calling tcp_moderate_cwnd() in tcp_try_to_open(). The fix: Don't moderate cwnd in tcp_try_to_open() if we're in TCP_CA_Open, since doing so is generally unnecessary and specifically would override a DSACK-based undo of a cwnd reduction made in fast recovery. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-27tcp: allow undo from reordered DSACKsNeal Cardwell
Previously, SACK-enabled connections hung around in TCP_CA_Disorder state while snd_una==high_seq, just waiting to accumulate DSACKs and hopefully undo a cwnd reduction. This could and did lead to the following unfortunate scenario: if some incoming ACKs advance snd_una beyond high_seq then we were setting undo_marker to 0 and moving to TCP_CA_Open, so if (due to reordering in the ACK return path) we shortly thereafter received a DSACK then we were no longer able to undo the cwnd reduction. The change: Simplify the congestion avoidance state machine by removing the behavior where SACK-enabled connections hung around in the TCP_CA_Disorder state just waiting for DSACKs. Instead, when snd_una advances to high_seq or beyond we typically move to TCP_CA_Open immediately and allow an undo in either TCP_CA_Open or TCP_CA_Disorder if we later receive enough DSACKs. Other patches in this series will provide other changes that are necessary to fully fix this problem. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-27tcp: use SACKs and DSACKs that arrive on ACKs below snd_unaNeal Cardwell
The bug: When the ACK field is below snd_una (which can happen when ACKs are reordered), senders ignored DSACKs (preventing undo) and did not call tcp_fastretrans_alert, so they did not increment prr_delivered to reflect newly-SACKed sequence ranges, and did not call tcp_xmit_retransmit_queue, thus passing up chances to send out more retransmitted and new packets based on any newly-SACKed packets. The change: When the ACK field is below snd_una (the "old_ack" goto label), call tcp_fastretrans_alert to allow undo based on any newly-arrived DSACKs and try to send out more packets based on newly-SACKed packets. Other patches in this series will provide other changes that are necessary to fully fix this problem. Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-27tcp: use DSACKs that arrive when packets_out is 0Neal Cardwell
The bug: Senders ignored DSACKs after recovery when there were no outstanding packets (a common scenario for HTTP servers). The change: when there are no outstanding packets (the "no_queue" goto label), call tcp_fastretrans_alert() in order to use DSACKs to undo congestion window reductions. Other patches in this series will provide other changes that are necessary to fully fix this problem. Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-27tcp: make is_dupack a parameter to tcp_fastretrans_alert()Neal Cardwell
Allow callers to decide whether an ACK is a duplicate ACK. This is a prerequisite to allowing fastretrans_alert to be called from new contexts, such as the no_queue and old_ack code paths, from which we have extra info that tells us whether an ACK is a dupack. Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26atm: eliminate atm_guess_pdu2truesize()chas williams - CONTRACTOR
Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26dsa: Allow core and drivers to be built as modulesBen Hutchings
Change the kconfig types to tristate and adjust the condition for declaring net_device::dsa_ptr to allow for this. Adjust the makefile so that if NET_DSA_MV88E6123_61_65=y and NET_DSA_MV88E6131=m or vice versa then both drivers are built-in. We could leave these options as bool and make NET_DSA_MV88E6XXX a user-selected option, but that would break existing configurations. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26dsa: Define module author, description, license and aliases for driversBen Hutchings
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26mv88e6xxx: Combine mv88e6131 and mv88e612_61_65 driversBen Hutchings
These drivers share a lot of code, so if we make them modular they should be built into the same module. Therefore, link them together and merge their respective module init and exit functions. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26dsa: Combine core and tagging codeBen Hutchings
These files have circular dependencies, so if we make DSA modular then they must be built into the same module. Therefore, link them together and merge their respective module init and exit functions. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26dsa: Export functions from core to modulesBen Hutchings
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26dsa: Change dsa_uses_{dsa, trailer}_tags() into inline functionsBen Hutchings
eth_type_trans() will use these functions if DSA is enabled, which blocks building DSA as a module. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/ipv4/inet_diag.c
2011-11-26ipv4: Don't use the cached pmtu informations for input routesSteffen Klassert
The pmtu informations on the inetpeer are visible for output and input routes. On packet forwarding, we might propagate a learned pmtu to the sender. As we update the pmtu informations of the inetpeer on demand, the original sender of the forwarded packets might never notice when the pmtu to that inetpeer increases. So use the mtu of the outgoing device on packet forwarding instead of the pmtu to the final destination. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26net: Move mtu handling down to the protocol depended handlersSteffen Klassert
We move all mtu handling from dst_mtu() down to the protocol layer. So each protocol can implement the mtu handling in a different manner. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26net: Rename the dst_opt default_mtu method to mtuSteffen Klassert
We plan to invoke the dst_opt->default_mtu() method unconditioally from dst_mtu(). So rename the method to dst_opt->mtu() to match the name with the new meaning. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26route: Use the device mtu as the default for blackhole routesSteffen Klassert
As it is, we return null as the default mtu of blackhole routes. This may lead to a propagation of a bogus pmtu if the default_mtu method of a blackhole route is invoked. So return dst->dev->mtu as the default mtu instead. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-26Merge branch 'for_david' of git://git.open-mesh.org/linux-mergeDavid S. Miller
2011-11-25netns: fix proxy ARP entries listing on a netnsJorge Boncompte [DTI2]
Skip entries from foreign network namespaces. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-25net/netlabel: copy and paste bug in netlbl_cfg_unlbl_map_add()Dan Carpenter
This was copy and pasted from the IPv4 code. We're calling the ip4 version of that function and map4 is NULL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-23ipv4: Save nexthop address of LSRR/SSRR option to IPCB.Li Wei
We can not update iph->daddr in ip_options_rcv_srr(), It is too early. When some exception ocurred later (eg. in ip_forward() when goto sr_failed) we need the ip header be identical to the original one as ICMP need it. Add a field 'nexthop' in struct ip_options to save nexthop of LSRR or SSRR option. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-23net: treewide use of RCU_INIT_POINTEREric Dumazet
rcu_assign_pointer(ptr, NULL) can be safely replaced by RCU_INIT_POINTER(ptr, NULL) (old rcu_assign_pointer() macro was testing the NULL value and could omit the smp_wmb(), but this had to be removed because of compiler warnings) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-23ipv4 : igmp : fix error handle in ip_mc_add_src()Jun Zhao
When add sources to interface failure, need to roll back the sfcount[MODE] to before state. We need to match it corresponding. Acked-by: David L Stevens <dlstevens@us.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jun Zhao <mypopydev@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-23ipv6: tcp: fix tcp_v6_conn_request()Eric Dumazet
Since linux 2.6.26 (commit c6aefafb7ec6 : Add IPv6 support to TCP SYN cookies), we can drop a SYN packet reusing a TIME_WAIT socket. (As a matter of fact we fail to send the SYNACK answer) As the client resends its SYN packet after a one second timeout, we accept it, because first packet removed the TIME_WAIT socket before being dropped. This probably explains why nobody ever noticed or complained. Reported-by: Jesse Young <jlyo@jlyo.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-23netfilter: Remove NOTRACK/RAW dependency on NETFILTER_ADVANCED.David S. Miller
Distributions are using this in their default scripts, so don't hide them behind the advanced setting. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-23ipv6: tcp: fix panic in SYN processingEric Dumazet
commit 72a3effaf633bc ([NET]: Size listen hash tables using backlog hint) added a bug allowing inet6_synq_hash() to return an out of bound array index, because of u16 overflow. Bug can happen if system admins set net.core.somaxconn & net.ipv4.tcp_max_syn_backlog sysctls to values greater than 65536 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-23ipv6: fix a bug in ndisc_send_redirectLi Wei
Release skb when transmit rate limit _not_ allow Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2011-11-22net: remove ipv6_addr_copy()Alexey Dobriyan
C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22net: correct comments of skb_shiftFeng King
when skb_shift, we want to shift paged data from skb to tgt frag area. Original comments revert the shift order Signed-off-by: Feng King <kinwin2008@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22atm: Allow MSG_PEEK for atm socketsJorge Boncompte [DTI2]
Now that the vcc backends do the right thing with respect the receive queue on registration, allow MSK_PEEK for atm sockets. This allows a userspace program to inspect the packets and decide what backend to use to handle them. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22atm: Introduce vcc_process_recv_queueJorge Boncompte [DTI2]
This function moves the implementation found in the clip and br2684 modules to common code, correctly unlinks the skb from the queue before pushing it and makes pppoatm use it. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22atm: clip: move clip_devs check to clip_pushJorge Boncompte [DTI2]
This will allow further cleanup. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22atm: clip: Don't move counters backwardsJorge Boncompte [DTI2]
I don't see the point on substracting the skb len from the netdev stats. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>