path: root/src
Commit message | Author | Age | Files | Lines
* Merge pull request #19157 from keszybz/read-medium-sized-virtual-filev248-2v248Lennart Poettering2021-03-302-21/+29
|\ | | | | basic/fileio: fix reading of not-too-small virtual files
| * test-fileio: test for read_full_virtual_file()Zbigniew Jędrzejewski-Szmek2021-03-301-0/+20
| | | | | | | | | | It was already called through other places, but indirectly. Let's add some direct invocations.
| * basic/fileio: fix reading of not-too-small virtual filesZbigniew Jędrzejewski-Szmek2021-03-301-21/+9
| |
| | This code is trying to do two things: when reading a file with working st.st_size, detect when the file size changes between the fstat() and our allocation of the buffer based on the returned size, and the subsequent read(). When reading a file without st.st_size, read up to READ_FULL_BYTES_MAX.
| |
| | But this second scenario was partially broken: we'd start with size = 4095, and double the size up to three times, i.e. up to 32767. But we want to read up to READ_FULL_BYTES_MAX.
| |
| | So let's disentangle the two cases a bit: if a file returns non-zero st.st_size, proceed as before. But if we don't know the size, let's immediately allocate the buffer of maximum size of READ_FULL_BYTES_MAX. I think that allocating 4MB and 1MB is going to take pretty much the same time as long as the memory is not written to, so by allocating 1MB, 2MB, and 4MB, we wouldn't really be saving anything internally, but wasting time on repeated reads, if the file is long enough.
| |
| | Also, don't do the seek if we know we're going to return an error immediately after.
| |
| | This should fix reading of any files in /proc, which all have size == 0. In particular, various files read by coredump might be larger than 32767.
| |
| | What about /sys? The files there return a fake value, usually 4096. So we'll allocate a small buffer and read that.
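The two cases described in this commit message can be sketched as a small size-selection helper. This is only an illustration, not the code from the commit: `sketch_pick_read_size()` and `SKETCH_READ_FULL_BYTES_MAX` are hypothetical names, and the 4 MiB cap is an assumption standing in for READ_FULL_BYTES_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for READ_FULL_BYTES_MAX; the real constant is defined in systemd. */
#define SKETCH_READ_FULL_BYTES_MAX ((size_t) (4U * 1024U * 1024U - 1U))

/* Hypothetical helper: decide how many payload bytes to allocate for one read attempt.
 * Returns 0 and stores the size in *ret, or -EFBIG if the reported size exceeds the cap. */
static int sketch_pick_read_size(uint64_t st_size, size_t *ret) {
        assert(ret);

        /* Known size (regular files, or the fake 4096 that /sys reports):
         * refuse early if it is over the cap, otherwise use it as is. */
        if (st_size > SKETCH_READ_FULL_BYTES_MAX)
                return -EFBIG;

        /* Unknown size (st_size == 0, as for files in /proc):
         * go straight to the maximum instead of doubling step by step. */
        *ret = st_size > 0 ? (size_t) st_size : SKETCH_READ_FULL_BYTES_MAX;
        return 0;
}
```

With these assumptions, a /proc file (reported size 0) gets the full cap in one allocation, while a /sys file reporting 4096 gets a 4096-byte buffer.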
* | Merge pull request #19149 from anitazha/oomdloggingLuca Boccassi2021-03-303-32/+85
|\ \ | | | | | | oomd: make it more clear when a kill happens
| * | oomd: fix iteration over candidates to killZbigniew Jędrzejewski-Szmek2021-03-301-10/+10
| | |
| * | oomd: make it more clear when a kill happensAnita Zhang2021-03-303-24/+77
| | | | | | | | | | | | | | | | | | | | | | | | Improve the logging to only print if systemd-oomd killed something. And also print which cgroup was targeted. Demote general swap above/pressure above messages to debug. [zjs: fix some issuelets found in review]
* | | Merge pull request #19131 from keszybz/resolvectl-warn-lessLennart Poettering2021-03-305-68/+124
|\ \ \ | |_|/ |/| | Suppress warnings in resolvectl about --type=
| * | resolvectl: suppress warning about --type for names with a dotZbigniew Jędrzejewski-Szmek2021-03-261-1/+1
| | | | | | | | | | | | | | | People don't generally type the trailing dot by mistake, so let's treat this as indication that they want to resolve this particular hostname.
| * | resolvectl: do not warn about single hostnames for names we synthesizeZbigniew Jędrzejewski-Szmek2021-03-261-1/+22
| | | | | | | | | | | | https://github.com/systemd/systemd/pull/17535#discussion_r534005801
| * | resolved: split out function to determine the local llmnr hostnameZbigniew Jędrzejewski-Szmek2021-03-264-64/+99
| | |
| * | resolvectl: reword note about "raw record types"Zbigniew Jędrzejewski-Szmek2021-03-261-3/+3
| |/ | | | | | | | | As noted in https://github.com/systemd/systemd/pull/17535#discussion_r534129256, "raw" is misleading in this context. Let's use a more descriptive term.
* | selinux: do not crash if policy becomes unavailable after reloadZbigniew Jędrzejewski-Szmek2021-03-301-1/+6
| |
| | https://bugzilla.redhat.com/show_bug.cgi?id=1944171
| |
| | This was in F33, systemd-246.13, but the logic in the code didn't change.
| |
| | Thread 1 (Thread 0x7fb5f0341b80 (LWP 1974)):
| | №0 selabel_lookup_common (rec=0x0, translating=0, key=0x55f616ac4750 "/run/user/1000/systemd/units/invocation:systemd-tmpfiles-clean.service", type=40960) at label.c:167
| |
| | 'rec' is the handle that we passed.
| |
| | №1 0x00007fb5f13ae87f in selabel_lookup_raw (rec=<optimized out>, con=con@entry=0x7fffef307380, key=key@entry=0x55f616ac4750 "/run/user/1000/systemd/units/invocation:systemd-tmpfiles-clean.service", type=type@entry=40960) at label.c:256
| |         lr = <optimized out>
| |
| | 'rec' is passed through as is to selabel_lookup_common().
| |
| | №2 0x00007fb5f1561b2d in selinux_create_file_prepare_abspath (abspath=0x55f616ac4750 "/run/user/1000/systemd/units/invocation:systemd-tmpfiles-clean.service", mode=40960) at ../src/basic/selinux-util.c:368
| |         filecon = 0x0
| |         r = <optimized out>
| |         __PRETTY_FUNCTION__ = "selinux_create_file_prepare_abspath"
| |         __func__ = "selinux_create_file_prepare_abspath"
| | №3 0x00007fb5f1561ec3 in mac_selinux_create_file_prepare (path=<optimized out>, mode=40960) at ../src/basic/selinux-util.c:431
| |         r = 0
| |         abspath = 0x55f616ac4750 "/run/user/1000/systemd/units/invocation:systemd-tmpfiles-clean.service"
| |         __PRETTY_FUNCTION__ = "mac_selinux_create_file_prepare"
| |
| | We checked label_hnd != NULL, but then we apparently called avc_netlink_check_nb(), which reset label_hnd. Yay for global state!
| |
| | №4 0x00007fb5f1549950 in symlink_atomic_label (from=0x55f6169d8b50 "69a8dcf7a7ac46b29306f2fddbed3edc", to=0x55f616ab8380 "/run/user/1000/systemd/units/invocation:systemd-tmpfiles-clean.service") at ../src/basic/label.c:55
| |         r = <optimized out>
| |         __PRETTY_FUNCTION__ = "symlink_atomic_label"
| |
| | In the logs:
| | Mar 29 14:48:44 fedorapad.home systemd[1974]: selinux: avc: received policyload notice (seqno=2)
| | Mar 29 14:48:44 fedorapad.home systemd[1974]: Failed to initialize SELinux labeling handle: No such file or directory
| | Mar 29 14:48:44 fedorapad.home systemd[1974]: selinux: avc: received policyload notice (seqno=3)
| | Mar 29 14:48:44 fedorapad.home systemd[1974]: selinux: avc: received setenforce notice (enforcing=0)
* | sd-bus: set retain attribute on BUS_ERROR_MAP_ELF_REGISTERFangrui Song2021-03-291-0/+6
| |
| | LLD 13 and GNU ld 2.37 support -z start-stop-gc which allows garbage collection of C identifier name sections despite the __start_/__stop_ references. Simply set the retain attribute so that GCC 11 (if configure-time binutils is 2.36 or newer)/Clang 13 will set the SHF_GNU_RETAIN section attribute to prevent garbage collection.
| |
| | Without the patch, there are linker errors like the following with -z start-stop-gc.
| | ```
| | ld: error: undefined symbol: __start_SYSTEMD_BUS_ERROR_MAP
| | >>> referenced by bus-error.c:93 (../src/libsystemd/sd-bus/bus-error.c:93)
| | >>> sd-bus_bus-error.c.o:(bus_error_name_to_errno) in archive src/libsystemd/libsystemd_static.a
| | ```
* | Merge pull request #19116 from keszybz/readvirtualfile-optZbigniew Jędrzejewski-Szmek2021-03-293-23/+24
|\ \ | | | | | | Optimize read_full_virtual_file() and another coverity issue
| * | tests: drop calls to unsetenv SYSTEMD_MEMPOOLZbigniew Jędrzejewski-Szmek2021-03-262-4/+0
| | |
| | | Coverity was complaining that we don't check the return value, which we stopped doing in 772e0a76f34914f6f81205e912e4744c6b23f704. But it seems that we don't want those calls at all. The test was originally added with the call in a6ee01caf3409ba9820e8824b9262fbac31a9f77, but I don't see why we should override this. If the user wants to execute the test with mempool disabled, we shouldn't ignore that.
| | |
| | | Coverity CID#1444464, CID#1444466.
| * | basic/fileio: use malloc_usable_size() to use all allocated memoryZbigniew Jędrzejewski-Szmek2021-03-261-0/+1
| | |
| * | basic/fileio: optimize buffer sizes in read_full_virtual_file()Zbigniew Jędrzejewski-Szmek2021-03-261-14/+14
| | |
| | | We'd proceed rather inefficiently: the initial buffer size was LINE_MAX/2, i.e. only 1k. We can read 4k at the same cost.
| | |
| | | Also, we'd try to allocate 1025, 2049, 4097 bytes, i.e. always one higher than the power-of-two size. Effectively the allocation would be bigger, and we'd waste the additional space. So let's allocate aligned to the power-of-two size. size=4095, 8191, 16383, so we allocate 4k, 8k, 16k.
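The size sequence in this message (4095, 8191, 16383) follows from keeping the payload size plus one extra byte at an exact power of two. A minimal sketch of that arithmetic, assuming the extra byte is a trailing NUL; `sketch_next_size()` is a hypothetical name and not taken from the commit:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: next payload size after `size`.
 * If size + 1 is a power of two, then (size * 2 + 1) + 1 is the next power of two,
 * so each allocation of size + 1 bytes fills exactly 4k, 8k, 16k, ... */
static size_t sketch_next_size(size_t size) {
        assert(size < SIZE_MAX / 2);    /* guard against overflow of size * 2 + 1 */
        return size * 2 + 1;
}
```

Starting from 4095, repeated calls yield 8191 and then 16383. The older scheme described above asked for 1025, 2049 or 4097 bytes, each just past a power of two.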
| * | basic/fileio: simplify calculation of buffer size in read_full_virtual_file()Zbigniew Jędrzejewski-Szmek2021-03-261-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | We'd first assign a value up to SSIZE_MAX, and then immediately check if we have a value bigger than READ_FULL_BYTES_MAX. This wasn't exactly wrong, but a bit roundabout. Let's immediately assign the value from the appropriate range or error out. Coverity CID#1450973.
* | | Merge pull request #19129 from keszybz/test-random-rangeZbigniew Jędrzejewski-Szmek2021-03-262-9/+57
|\ \ \ | | | | | | | | Test random_u64_range()
| * | | test-random-util: add stochastic test for random_u64_range()Zbigniew Jędrzejewski-Szmek2021-03-261-2/+50
| | | |
| * | | test-random-util: modernizationZbigniew Jędrzejewski-Szmek2021-03-261-6/+6
| | | |
| * | | basic/log: fix log_trace()Zbigniew Jędrzejewski-Szmek2021-03-261-1/+1
| | |/ | |/| | | | | | | log_trace() was always on. It's supposed to be opt-in.
* | | test-dhcp6-client: add one more assert on memory mappingLuca Boccassi2021-03-261-0/+1
| | | | | | | | | | | | | | | | | | | | | Same as 7489d0640a4864d4b47fd8fda77f8eb7cf2e3fe8, one more case that was missed. Coverity CID #1394277
* | | Merge pull request #19125 from keszybz/cat-configZbigniew Jędrzejewski-Szmek2021-03-267-0/+14
|\ \ \ | | | | | | | | config files: recommend systemd-analyze cat-config
| * | | config files: recommend systemd-analyze cat-configZbigniew Jędrzejewski-Szmek2021-03-267-0/+14
| |/ /
| | |
| | | This adds the same line to most of our .conf files. Not for systemd/user.conf though, since we can't correctly display it right now:
| | | $ systemd-analyze cat-config --user systemd/user.conf
| | | Option --user is not supported for cat-config right now.
| | |
| | | For sysusers.d, tmpfiles.d, rules.d, etc, there is no single file. Maybe we should add short READMEs in /usr/lib/sysusers.d, /usr/lib/tmpfiles.d, etc.?
| | |
| | | Inspired by #19118.
* | | resolved: tweak how we signal authoritative answersLennart Poettering2021-03-263-5/+16
| | |
| | | Let's make sure we set the "aa" bit in the stub only if we answer with fully authoritative data. For this ensure:
| | |
| | | 1. Either all data is synthetic, including all CNAME/DNAME redirects
| | | 2. Or all data comes from the local trust anchor or the local zones (i.e. not the network or the cache)
| | |
| | | Follow-up for 4ad017cda57b04b9d65e7da962806cfcc50b5f0c
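The two conditions above amount to a check over the origin of every RR in the reply. The sketch below is a toy model only: the `SketchSource` enum and `sketch_fully_authoritative()` are hypothetical and do not mirror resolved's data structures, and a reply mixing synthetic and local-zone data is treated as not authoritative, which is one reading of the "either/or" wording.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical origin tag for one RR in a stub reply. */
typedef enum SketchSource {
        SKETCH_SYNTHETIC,
        SKETCH_TRUST_ANCHOR,
        SKETCH_ZONE,
        SKETCH_NETWORK,
        SKETCH_CACHE,
} SketchSource;

/* Returns true if the "aa" bit may be set under the two rules in the commit message. */
static bool sketch_fully_authoritative(const SketchSource *sources, size_t n) {
        bool all_synthetic = true, all_local = true;

        assert(sources || n == 0);

        for (size_t i = 0; i < n; i++) {
                if (sources[i] != SKETCH_SYNTHETIC)
                        all_synthetic = false;          /* rule 1 no longer holds */
                if (sources[i] != SKETCH_TRUST_ANCHOR && sources[i] != SKETCH_ZONE)
                        all_local = false;              /* rule 2 no longer holds */
        }

        return all_synthetic || all_local;
}
```

A single RR from the network or the cache is enough to clear the bit in this model.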
* | | use the right member to define propertyDavid Tardon2021-03-261-1/+1
|/ /
* | Merge pull request #19112 from poettering/more-stub-fixesZbigniew Jędrzejewski-Szmek2021-03-255-117/+173
|\ \ | | | | | | resolved: two more tweaks to the stub
| * | resolved: rework CNAME logic a bit moreLennart Poettering2021-03-251-83/+76
| | |
| | | When following CNAME/DNAME redirects in the stub we currently first iterate through the packet and pick up what we can use (in dns_stub_collect_answer_by_question() and friends), following all CNAMEs/DNAMEs, and would then issue dns_query_process_cname() to move the DnsQuery object forward too, where we'd then possibly restart the query and pick things up again, as above.
| | |
| | | There's one thought error in this though: dns_query_process_cname() tries to be smart and will internally follow not just a single CNAME/DNAME redirect, but a chain of them if they are contained inside the same packet until we reach the point where the answer is not included in the packet anymore, where we'd restart the query. This was great as long as we only focussed on the D-Bus and Varlink resolver APIs, since there the CNAME/DNAME chain in the middle doesn't actually matter, we just return information about the final name of the RR and its content, and aren't interested in the chain to it. For the DNS stub this is different however: there we need to place the full CNAME/DNAME chain (and all the appropriate metadata RRs) in the stub reply.
| | |
| | | Hence rework this so that we build on the fact that the previous commit split dns_query_process_cname() in two:
| | |
| | | 1. dns_query_process_cname_one() will do exactly one CNAME/DNAME redirect step. This will be called by the stub, so that we can pick up matching RRs for every single step along the way.
| | | 2. dns_query_process_cname_many() will follow a chain as long as that's possible within the same packet. It's thus pretty much identical to the old dns_query_process_cname() call. This is what we now use in the D-Bus and Varlink APIs.
| | |
| | | dns_query_process_cname_many() is basically just a loop around dns_query_process_cname_one().
| | |
| | | Any logic to follow and pick up RRs manually in the stub along the CNAME/DNAME path is now dropped (i.e. dns_stub_collect_answer_by_question() becomes trivially simple again), we solely rely on dns_query_process_cname_one() to follow CNAME/DNAME now: each step followed by a full call of dns_stub_assign_sections() to copy out the RRs that matter.
| | |
| | | Net result: things are a bit simpler again, as the only place we follow CNAME/DNAME redirects is DnsQuery again, and stub answers are always complete: they contain all CNAME/DNAME RRs on the way including all their metadata we might pick up in the other sections.
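The "one step" versus "loop around one step" split can be illustrated with a toy model. Everything here is hypothetical: the `SketchCname` table stands in for the CNAME records of a single packet, the step limit of 8 is an assumed value, and the return conventions are simplified compared to the real functions, which operate on DnsQuery objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical CNAME record inside one packet: `from` is an alias for `to`. */
typedef struct SketchCname {
        const char *from;
        const char *to;
} SketchCname;

enum { SKETCH_NOMATCH = 0, SKETCH_CNAME = 1 };

#define SKETCH_CNAME_MAX 8      /* assumed cap on redirects followed */

/* Exactly one redirect step: if the packet has a CNAME for *name, advance *name. */
static int sketch_process_cname_one(const SketchCname *packet, size_t n, const char **name) {
        assert(packet || n == 0);
        assert(name && *name);

        for (size_t i = 0; i < n; i++)
                if (strcmp(packet[i].from, *name) == 0) {
                        *name = packet[i].to;
                        return SKETCH_CNAME;
                }

        return SKETCH_NOMATCH;
}

/* Follow the chain for as long as the same packet allows, with an explicit loop
 * instead of recursion. Returns the number of steps taken, or -1 past the cap. */
static int sketch_process_cname_many(const SketchCname *packet, size_t n, const char **name) {
        int steps = 0;

        while (sketch_process_cname_one(packet, n, name) == SKETCH_CNAME)
                if (++steps > SKETCH_CNAME_MAX)
                        return -1;

        return steps;
}
```

In this model a stub-like caller would call the one-step function and copy out the matching RRs after each step, while a D-Bus or Varlink style caller would call the many-step function and only look at the final name.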
| * | resolved: split dns_query_process_cname() into two separate functionsLennart Poettering2021-03-255-28/+91
| | |
| | | This does some refactoring: the dns_query_process_cname() function becomes two: dns_query_process_cname_one() and dns_query_process_cname_many(). The former will process exactly one CNAME chain element, the latter will follow a chain for as long as possible within the current packet.
| | |
| | | dns_query_process_cname_many() is mostly identical to the old dns_query_process_cname(), and all existing code is moved over to using that.
| | |
| | | This is mostly preparation for the next commit, where we make direct use of dns_query_process_cname_one().
| | |
| | | This also renames the DNS_QUERY_RESTARTED return value to DNS_QUERY_CNAME. That's because in the dns_query_process_cname_many() case, as before, returning this means we restarted the query because we reached the end of the chain without a conclusive answer. But in dns_query_process_cname_one() we'll only go one step anyway, and leave restarting if needed to the caller. Hence DNS_QUERY_RESTARTED is a bit of a misnomer in that case.
| | |
| | | This also gets rid of the weird tail recursion in dns_query_process_cname() and replaces it with an explicit loop in dns_query_process_cname_many(). The old recursion wasn't a security issue since we put a limit on the number of CNAMEs we follow anyway, but it's still icky to scale stack use by that.
| * | resolved: tweak sections we add answer RRs toLennart Poettering2021-03-251-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | Previously we'd stick all answer sections RRs we acquired into the authoritative section if we didn't find them directly answering our question. Let's put them into additional instead. The authoritative section should hence only include what comes from the upstream authoritative section, and nothing else.
* | | Merge pull request #19117 from bluca/coverityLuca Boccassi2021-03-252-0/+7
|\ \ \ | |/ / |/| | Two small coverity issues
| * | test-dhcp6-client: add one more assert on memory mappingLuca Boccassi2021-03-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Static analyzers need a hint that optval is not pointing off the end of the msg_advertise array, since pos can go up to the full length of it. The array is manually constructed so we know this won't happen, but adding one more assert should be enough to avoid false positives. Coverity CID #1394277
| * | test-firewall-util: add more asserts on allocated variablesLuca Boccassi2021-03-251-0/+6
| | | | | | | | | | | | | | | | | | Makes things nicer for readers, and hopefully gives static analyzers a hint on the origin/cleanup of the ctx pointer. Coverity CID #1451399
* | | resolved: pass mDNS reply packets to each transaction exactly onceLennart Poettering2021-03-251-17/+10
|/ /
| |
| | Previously we'd iterate through the RRs of an mDNS reply and then find exactly one matching transaction on our scope for it, and pass it as reply to that. If multiple RRs of the same packet match we'd pass the packet multiple times to the transaction even.
| |
| | This all doesn't really work anymore since there can be multiple open transactions for the same key (with different flags), and it's kinda ugly anyway. Hence let's turn this around: let's iterate through the transactions and check if any of the included RRs match it, and if so pass the packet to that transaction exactly once.
| |
| | This speeds up mDNS a bit, since previously we'd oftentimes fail to find all suitable transactions for an mDNS reply (because there can be multiple transactions for the same RR key with different flags, and we checked exactly one flag combination). Which would then mean the transaction would time out, and be retried – at which point the cache would be populated and thus it would still succeed, but only after this timeout. With this fix this is corrected: every transaction that matches will get the reply, instantly as we get it.
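The inverted iteration order described here (outer loop over transactions, inner loop over RRs, stop at the first match) can be sketched as follows. The types and names are hypothetical and keys are plain strings for brevity; resolved's real transactions and RR keys are richer objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical open transaction: several may share a key and differ only in flags. */
typedef struct SketchTransaction {
        const char *key;
        unsigned flags;
        unsigned replies_seen;  /* stands in for "the packet was handed to this transaction" */
} SketchTransaction;

/* Hand one reply packet (given as the keys of its RRs) to every matching
 * transaction exactly once. Returns the number of transactions that got it. */
static size_t sketch_dispatch_reply(
                SketchTransaction *ts, size_t n_ts,
                const char *const *rr_keys, size_t n_rrs) {

        size_t delivered = 0;

        assert(ts || n_ts == 0);
        assert(rr_keys || n_rrs == 0);

        for (size_t i = 0; i < n_ts; i++)
                for (size_t j = 0; j < n_rrs; j++)
                        if (strcmp(ts[i].key, rr_keys[j]) == 0) {
                                ts[i].replies_seen++;
                                delivered++;
                                break;  /* one delivery per transaction, however many RRs match */
                        }

        return delivered;
}
```

Two transactions for the same key with different flags both receive the packet, and a packet with two RRs for that key is still delivered to each of them only once.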
* | resolved: upgrade log level to LOG_NOTICE if we switch to fallback server (or back)Lennart Poettering2021-03-251-3/+4
| |
| | This is inspired by a recent thread on fedora-devel: it's noteworthy when we switch to the fallback servers, since it might (or might not) indicate some configuration problem.
| |
| | Fixes: #18788
* | resolved: don't suppress OPT if we have no OPTLennart Poettering2021-03-251-1/+1
| | | | | | | | | | This is inspired by #18917. It suppresses a misleading log message about suppressing OPT where we might not actually have OPT.
* | Merge pull request #19076 from yuwata/firewall-util-modernizationsLuca Boccassi2021-03-246-201/+243
|\ \ | | | | | | firewall-util: modernize code and improve test
| * | test-firewall-util: use assert_se() at most placesYu Watanabe2021-03-251-78/+74
| | | | | | | | | | | | Otherwise, we cannot notice any failures...
| * | firewall-util: refuse IPv6 firewall rules when kernel does not support IPv6Yu Watanabe2021-03-231-3/+11
| | |
| * | firewall-util: gracefully handle -EOVERFLOW returned from older kernelYu Watanabe2021-03-231-0/+13
| | |
| * | firewall-util: do not use goto for retryingYu Watanabe2021-03-231-62/+70
| | |
| * | firewall-util: add missing return value checkYu Watanabe2021-03-231-2/+4
| | |
| * | firewall-util: probe firewall backend in fw_ctx_new()Yu Watanabe2021-03-231-12/+2
| | |
| | | FirewallContext is used by networkd and nspawn. Both allocate the context when it is really necessary. Hence, it is not necessary to delay probing the backend.
| | |
| | | Moreover, if the iptables backend is not enabled on build, and nftables is not supported by the kernel, previously `fw_nftables_init()` was called every time we tried to configure masquerade or dnat. It causes significant performance loss.
| | |
| | | Fixes test-firewall-util issue in #19052.
| * | network: allocate FirewallContext lazilyYu Watanabe2021-03-231-4/+0
| | |
| * | firewall-util: logs which backend will be usedYu Watanabe2021-03-233-44/+73
| | | | | | | | | | | | This also modernizes code a bit.
* | | local-addresses: fix use of uninitialized valueDavid Tardon2021-03-241-1/+1
| |/
|/|
| |
| | This can happen if ifi fails to be read from the netlink message and the error is ENODATA.
| |
| | Fixes the following valgrind message when running netstat:
| |
| | ==164141== Conditional jump or move depends on uninitialised value(s)
| | ==164141==    at 0x524AE60: address_compare (local-addresses.c:29)
| | ==164141==    by 0x48BCC78: msort_with_tmp.part.0 (msort.c:105)
| | ==164141==    by 0x48BC9E4: msort_with_tmp (msort.c:45)
| | ==164141==    by 0x48BC9E4: msort_with_tmp.part.0 (msort.c:53)
| | ==164141==    by 0x48BCF85: msort_with_tmp (msort.c:45)
| | ==164141==    by 0x48BCF85: qsort_r (msort.c:297)
| | ==164141==    by 0x52500FC: UnknownInlinedFun (sort-util.h:47)
| | ==164141==    by 0x52500FC: local_gateways.constprop.0 (local-addresses.c:310)
| | ==164141==    by 0x5251C05: _nss_myhostname_gethostbyaddr2_r (nss-myhostname.c:456)
| | ==164141==    by 0x5252006: _nss_myhostname_gethostbyaddr_r (nss-myhostname.c:500)
| | ==164141==    by 0x498E7FE: gethostbyaddr_r@@GLIBC_2.2.5 (getXXbyYY_r.c:274)
| | ==164141==    by 0x498E560: gethostbyaddr (getXXbyYY.c:135)
| | ==164141==    by 0x121353: INET_rresolve.constprop.0 (inet.c:212)
| | ==164141==    by 0x1135B9: INET_sprint (inet.c:261)
| | ==164141==    by 0x121BFC: addr_do_one.constprop.0.isra.0 (netstat.c:1156)
* | process-util: dont allocate max length to read /proc/PID/cmdlineAnita Zhang2021-03-242-25/+8
| |
| | Alternative title: Replace get_process_cmdline()'s fopen()/fread() with read_full_virtual_file().
| |
| | When RLIMIT_STACK is set to infinity:infinity, _SC_ARG_MAX will return 4611686018427387903 (depending on the system, but definitely something larger than most systems have). It's impractical to allocate this in one go when most cmdlines are much shorter than that.
| |
| | Instead use read_full_virtual_file() which seems to increase the buffer depending on the size of the contents.
* | pid1: do not use generated strings as format strings (#19098)Lincoln Ramsay2021-03-241-3/+3
| | | | | | | | | | | | | | The generated string may include %, which will confuse both the xprintf call, and the VA_FORMAT_ADVANCE macro. Pass the generated string as an argument to a "%s" format string instead.
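The fix pattern named here, passing the generated text as an argument to a "%s" format, can be shown with plain snprintf(). This is a generic illustration, not the pid1 code: `sketch_format_generated()` is a hypothetical helper, and the xprintf call and VA_FORMAT_ADVANCE macro mentioned above are not modeled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy a runtime-generated string, which may contain '%',
 * into buf without ever interpreting it as a format string. */
static int sketch_format_generated(char *buf, size_t size, const char *generated) {
        assert(buf);
        assert(generated);

        /* The risky variant would be snprintf(buf, size, generated): any '%' in the
         * generated text would then be parsed as a conversion specification. */
        return snprintf(buf, size, "%s", generated);
}
```

With the "%s" form, a generated string such as "100%s done %n" is copied through verbatim.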
* | network: fix ipv6 tunnel encapsulation limit (#19087)hide2021-03-241-1/+1
| | | | | | The encapsulation limit of IPv6 tunnel can not be set to 4, which is the default value of the encapsulation limit.