summaryrefslogtreecommitdiff
path: root/ident.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'jk/reset-ident-time-per-commit' into maintJunio C Hamano2016-08-121-0/+5
|\ | | | | | | | | | | | | | | | | | | | | Not-so-recent rewrite of "git am" that started making internal calls into the commit machinery had an unintended regression, in that no matter how many seconds it took to apply many patches, the resulting committer timestamp for the resulting commits were all the same. * jk/reset-ident-time-per-commit: am: reset cached ident date for each patch
| * am: reset cached ident date for each patchjk/reset-ident-time-per-commitJeff King2016-08-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we compute the date to go in author/committer lines of commits, or tagger lines of tags, we get the current date once and then cache it for the rest of the program. This is a good thing in some cases, like "git commit", because it means we do not racily assign different times to the author/committer fields of a single commit object. But as more programs start to make many commits in a single process (e.g., the recently builtin "git am"), it means that you'll get long strings of commits with identical committer timestamps (whereas before, we invoked "git commit" many times and got true timestamps). This patch addresses it by letting callers reset the cached time, which means they'll get a fresh time on their next call to git_committer_info() or git_author_info(). The first caller to do so is "git am", which resets the time for each patch it applies. It would be nice if we could just do this automatically before filling in the ident fields of commit and tag objects. Unfortunately, it's hard to know where a particular logical operation begins and ends. For instance, if commit_tree_extended() were to call reset_ident_date() before getting the committer/author ident, that doesn't quite work; sometimes the author info is passed in to us as a parameter, and it may or may not have come from a previous call to ident_default_date(). So in those cases, we lose the property that the committer and the author timestamp always match. You could similarly put a date-reset at the end of commit_tree_extended(). That actually works in the current code base, but it's fragile. It makes the assumption that after commit_tree_extended() finishes, the caller has no other operations that would logically want to fall into the same timestamp. So instead we provide the tool to easily do the reset, and let the high-level callers use it to annotate their own logical operations. There's no automated test, because it would be inherently racy (it depends on whether the program takes multiple seconds to run). But you can see the effect with something like: # make a fake 100-patch series top=$(git rev-parse HEAD) bottom=$(git rev-list --first-parent -100 HEAD | tail -n 1) git log --format=email --reverse --first-parent \ --binary -m -p $bottom..$top >patch # now apply it; this presumably takes multiple seconds git checkout --detach $bottom git am <patch # now count the number of distinct committer times; # prior to this patch, there would only be one, but # now we'd typically see several. git log --format=%ct $bottom.. | sort -u Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Helped-by: Paul Tan <pyokagan@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Merge branch 'da/user-useconfigonly' into HEADJunio C Hamano2016-05-181-6/+10
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "user.useConfigOnly" configuration variable makes it an error if users do not explicitly set user.name and user.email. However, its check was not done early enough and allowed another error to trigger, reporting that the default value we guessed from the system setting was unusable. This was a suboptimal end-user experience as we want the users to set user.name/user.email without relying on the auto-detection at all. * da/user-useconfigonly: ident: give "please tell me" message upon useConfigOnly error ident: check for useConfigOnly before auto-detection of name/email
* | \ Merge branch 'nd/error-errno'Junio C Hamano2016-05-171-5/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code for warning_errno/die_errno has been refactored and a new error_errno() reporting helper is introduced. * nd/error-errno: (41 commits) wrapper.c: use warning_errno() vcs-svn: use error_errno() upload-pack.c: use error_errno() unpack-trees.c: use error_errno() transport-helper.c: use error_errno() sha1_file.c: use {error,die,warning}_errno() server-info.c: use error_errno() sequencer.c: use error_errno() run-command.c: use error_errno() rerere.c: use error_errno() and warning_errno() reachable.c: use error_errno() mailmap.c: use error_errno() ident.c: use warning_errno() http.c: use error_errno() and warning_errno() grep.c: use error_errno() gpg-interface.c: use error_errno() fast-import.c: use error_errno() entry.c: use error_errno() editor.c: use error_errno() diff-no-index.c: use error_errno() ...
| * | | ident.c: use warning_errno()Nguyễn Thái Ngọc Duy2016-05-091-5/+3
| |/ / | | | | | | | | | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'da/user-useconfigonly'Junio C Hamano2016-04-291-6/+10
|\ \ \ | |/ / |/| / | |/ | | | | | | | | | | | | | | | | | | | | The "user.useConfigOnly" configuration variable makes it an error if users do not explicitly set user.name and user.email. However, its check was not done early enough and allowed another error to trigger, reporting that the default value we guessed from the system setting was unusable. This was a suboptimal end-user experience as we want the users to set user.name/user.email without relying on the auto-detection at all. * da/user-useconfigonly: ident: give "please tell me" message upon useConfigOnly error ident: check for useConfigOnly before auto-detection of name/email
| * ident: give "please tell me" message upon useConfigOnly errorda/user-useconfigonlyMarios Titas2016-04-011-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The env_hint message applies perfectly to the case when user.useConfigOnly is set and at least one of the user.name and the user.email are not provided. Additionally, use a less descriptive error message to discourage users from disabling user.useConfigOnly configuration variable to work around this error condition. We want to encourage them to set user.name or user.email instead. Signed-off-by: Marios Titas <redneb@gmx.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: check for useConfigOnly before auto-detection of name/emailMarios Titas2016-04-011-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | If user.useConfigOnly is set, it does not make sense to try to auto-detect the name and/or the email. The auto-detection may even result in a bogus name and trigger an error message. Check if the use-config-only is set and die if no explicit name was given, before attempting to auto-detect, to correct this. Signed-off-by: Marios Titas <redneb@gmx.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'da/user-useconfigonly'Junio C Hamano2016-02-171-22/+40
|\ \ | |/ | | | | | | | | | | | | | | | | | | The "user.useConfigOnly" configuration variable can be used to force the user to always set user.email & user.name configuration variables, serving as a reminder for those who work on multiple projects and do not want to put these in their $HOME/.gitconfig. * da/user-useconfigonly: ident: add user.useConfigOnly boolean for when ident shouldn't be guessed fmt_ident: refactor strictness checks
| * ident: add user.useConfigOnly boolean for when ident shouldn't be guessedDan Aloni2016-02-081-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It used to be that: git config --global user.email "(none)" was a viable way for people to force themselves to set user.email in each repository. This was helpful for people with more than one email address, targeting different email addresses for different clones, as it barred git from creating a commit unless the user.email config was set in the per-repo config to the correct email address. A recent change, 19ce497c (ident: keep a flag for bogus default_email, 2015-12-10), however, declared that an explicitly configured user.email is not bogus, no matter what its value is, so this hack no longer works. Provide the same functionality by adding a new configuration variable user.useConfigOnly; when this variable is set, the user must explicitly set user.email configuration. Signed-off-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Dan Aloni <alonid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * fmt_ident: refactor strictness checksJeff King2016-02-041-22/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function has evolved quite a bit over time, and as a result, the logic for "is this an OK ident" has been sprinkled throughout. This ends up with a lot of redundant conditionals, like checking want_name repeatedly. Worse, we want to know in many cases whether we are using the "default" ident, and we do so by comparing directly to the global strbuf, which violates the abstraction of the ident_default_* functions. Let's reorganize the function into a hierarchy of conditionals to handle similar cases together. The only case that doesn't just work naturally for this is that of an empty name, where our advice is different based on whether we came from ident_default_name() or not. We can use a simple flag to cover this case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | ident.c: read /etc/mailname with strbuf_getline()Junio C Hamano2016-01-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Just in case /etc/mailname file was edited with a DOS editor, read it with strbuf_getline() so that a stray CR is not included as the last character of the mail hostname. We _might_ want to more aggressively discard whitespace characters around the line with strbuf_trim(), but that is a bit outside the scope of this series. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | strbuf: introduce strbuf_getline_{lf,nul}()Junio C Hamano2016-01-151-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The strbuf_getline() interface allows a byte other than LF or NUL as the line terminator, but this is only because I wrote these codepaths anticipating that there might be a value other than NUL and LF that could be useful when I introduced line_termination long time ago. No useful caller that uses other value has emerged. By now, it is clear that the interface is overly broad without a good reason. Many codepaths have hardcoded preference to read either LF terminated or NUL terminated records from their input, and then call strbuf_getline() with LF or NUL as the third parameter. This step introduces two thin wrappers around strbuf_getline(), namely, strbuf_getline_lf() and strbuf_getline_nul(), and mechanically rewrites these call sites to call either one of them. The changes contained in this patch are: * introduction of these two functions in strbuf.[ch] * mechanical conversion of all callers to strbuf_getline() with either '\n' or '\0' as the third parameter to instead call the respective thin wrapper. After this step, output from "git grep 'strbuf_getline('" would become a lot smaller. An interim goal of this series is to make this an empty set, so that we can have strbuf_getline_crlf() take over the shorter name strbuf_getline(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jk/ident-loosen-getpwuid'Junio C Hamano2015-12-211-9/+40
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | When getpwuid() on the system returned NULL (e.g. the user is not in the /etc/passwd file or other uid-to-name mappings), the codepath to find who the user is to record it in the reflog barfed and died. Loosen the check in this codepath, which already accepts questionable ident string (e.g. host part of the e-mail address is obviously bogus), and in general when we operate fmt_ident() function in non-strict mode. * jk/ident-loosen-getpwuid: ident: loosen getpwuid error in non-strict mode ident: keep a flag for bogus default_email ident: make xgetpwuid_self() a static local helper
| * ident: loosen getpwuid error in non-strict modejk/ident-loosen-getpwuidJeff King2015-12-141-8/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the user has not specified an identity and we have to turn to getpwuid() to find the username or gecos field, we die immediately when getpwuid fails (e.g., because the user does not exist). This is OK for making a commit, where we have set IDENT_STRICT and would want to bail on bogus input. But for something like a reflog, where the ident is "best effort", it can be pain. For instance, even running "git clone" with a UID that is not in /etc/passwd will result in git barfing, just because we can't find an ident to put in the reflog. Instead of dying in xgetpwuid_self, we can instead return a fallback value, and set a "bogus" flag. For the username in an email, we already have a "default_email_is_bogus" flag. For the name field, we introduce (and check) a matching "default_name_is_bogus" flag. As a bonus, this means you now get the usual "tell me who you are" advice instead of just a "no such user" error. No tests, as this is dependent on configuration outside of git's control. However, I did confirm that it behaves sensibly when I delete myself from the local /etc/passwd (reflogs get written, and commits complain). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: keep a flag for bogus default_emailJeff King2015-12-101-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have to deduce the user's email address and can't come up with something plausible for the hostname, we simply write "(none)" or ".(none)" in the hostname. Later, our strict-check is forced to use strstr to look for this magic string. This is probably not a problem in practice, but it's rather ugly. Let's keep an extra flag that tells us the email is bogus, and check that instead. We could get away with simply setting the global in add_domainname(); it only gets called to write into git_default_email. However, let's make the code a little more obvious to future readers by actually passing a pointer to our "bogus" flag down the call-chain. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: make xgetpwuid_self() a static local helperJeff King2015-12-101-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | This function is defined in wrapper.c, but nobody besides ident.c uses it. And nobody is likely to in the future, either, as anything that cares about the user's name should be going through the ident code. Moving it here is a cleanup of the global namespace, but it will also enable further cleanups inside ident.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | ident: fix undefined variable when NO_IPV6 is setep/ident-with-getaddrinfoJeff King2015-12-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Commit 00bce77 (ident.c: add support for IPv6, 2015-11-27) moved the "gethostbyname" call out of "add_domainname" and into the helper function "canonical_name". But when moving the code, it forgot that the "buf" variable is passed as "host" in the helper. Reported-by: johan defries <johandefries@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | ident.c: add support for IPv6Elia Pinto2015-11-281-4/+27
|/ | | | | | | | | | | Add IPv6 support by implementing name resolution with the protocol agnostic getaddrinfo(3) API. The old gethostbyname(3) code is still available when git is compiled with NO_IPV6. Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net>
* Merge branch 'jk/commit-author-parsing'Junio C Hamano2014-09-191-15/+11
|\ | | | | | | | | | | | | | | | | | | | | | | Code clean-up. * jk/commit-author-parsing: determine_author_info(): copy getenv output determine_author_info(): reuse parsing functions date: use strbufs in date-formatting functions record_author_date(): use find_commit_header() record_author_date(): fix memory leak on malformed commit commit: provide a function to find a header in a buffer
| * date: use strbufs in date-formatting functionsJeff King2014-08-271-15/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many of the date functions write into fixed-size buffers. This is a minor pain, as we have to take special precautions, and frequently end up copying the result into a strbuf or heap-allocated buffer anyway (for which we sometimes use strcpy!). Let's instead teach parse_date, datestamp, etc to write to a strbuf. The obvious downside is that we might need to perform a heap allocation where we otherwise would not need to. However, it turns out that the only two new allocations required are: 1. In test-date.c, where we don't care about efficiency. 2. In determine_author_info, which is not performance critical (and where the use of a strbuf will help later refactoring). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | config --global --edit: create a template file if neededMatthieu Moy2014-07-251-1/+1
|/ | | | | | | | | | | | | | | | | | When the user has no ~/.gitconfig file, git config --global --edit used to launch an editor on an nonexistant file name. Instead, create a file with a default content before launching the editor. The template contains only commented-out entries, to save a few keystrokes for the user. If the values are guessed properly, the user will only have to uncomment the entries. Advanced users teaching newbies can create a minimalistic configuration faster for newbies. Beginners reading a tutorial advising to run "git config --global --edit" as a first step will be slightly more guided for their first contact with Git. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jk/split-broken-ident'Junio C Hamano2013-10-281-1/+15
|\ | | | | | | | | | | | | | | Make the fall-back parsing of commit objects with broken author or committer lines more robust to pick up the timestamps. * jk/split-broken-ident: split_ident: parse timestamp from end of line
| * split_ident: parse timestamp from end of linejk/split-broken-identJeff King2013-10-151-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split_ident currently parses left to right. Given this input: Your Name <email@example.com> 123456789 -0500\n We assume the name starts the line and runs until the first "<". That starts the email address, which runs until the first ">". Everything after that is assumed to be the timestamp. This works fine in the normal case, but is easily broken by corrupted ident lines that contain an extra ">". Some examples seen in the wild are: 1. Name <email>-<> 123456789 -0500\n 2. Name <email> <Name<email>> 123456789 -0500\n 3. Name1 <email1>, Name2 <email2> 123456789 -0500\n Currently each of these produces some email address (which is not necessarily the one the user intended) and end up with a NULL date (which is generally interpreted as the epoch by "git log" and friends). But in each case we could get the correct timestamp simply by parsing from the right-hand side, looking backwards for the final ">", and then reading the timestamp from there. In general, it's a losing battle to try to automatically guess what the user meant with their broken crud. But this particular workaround is probably worth doing. One, it's dirt simple, and can't impact non-broken cases. Two, it doesn't catch a single breakage we've seen, but rather a large class of errors (i.e., any breakage inside the email angle brackets may affect the email, but won't spill over into the timestamp parsing). And three, the timestamp is arguably more valuable to get right, because it can affect correctness (e.g., in --until cutoffs). This patch implements the right-to-left scheme described above. We adjust the tests in t4212, which generate a commit with such a broken ident, and now gets the timestamp right. We also add a test that fsck continues to detect the breakage. For reference, here are pointers to the breakages seen (as numbered above): [1] http://article.gmane.org/gmane.comp.version-control.git/221441 [2] http://article.gmane.org/gmane.comp.version-control.git/222362 [3] http://perl5.git.perl.org/perl.git/commit/13b79730adea97e660de84bbe67f9d7cbe344302 Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | format-patch: print in-body "From" only when neededjk/format-patch-fromJeff King2013-09-201-0/+29
|/ | | | | | | | | | | | | | | | | | | | Commit a908047 taught format-patch the "--from" option, which places the author ident into an in-body from header, and uses the committer ident in the rfc822 from header. The documentation claims that it will omit the in-body header when it is the same as the rfc822 header, but the code never implemented that behavior. This patch completes the feature by comparing the two idents and doing nothing when they are the same (this is the same as simply omitting the in-body header, as the two are by definition indistinguishable in this case). This makes it reasonable to turn on "--from" all the time (if it matches your particular workflow), rather than only using it when exporting other people's patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jn/do-not-drop-username-when-reading-from-etc-mailname'Junio C Hamano2013-02-011-1/+5
|\ | | | | | | | | | | | | | | | | We used to stuff "user@" and then append what we read from /etc/mailname to come up with a default e-mail ident, but a bug lost the "user@" part. This is to fix it. * jn/do-not-drop-username-when-reading-from-etc-mailname: ident: do not drop username when reading from /etc/mailname
| * ident: do not drop username when reading from /etc/mailnameJonathan Nieder2013-01-251-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An earlier conversion from fgets() to strbuf_getline() in the codepath to read from /etc/mailname to learn the default host-part of the ident e-mail address forgot that strbuf_getline() stores the line at the beginning of the buffer just like fgets(). The "username@" the caller has prepared in the strbuf, expecting the function to append the host-part to it, was lost because of this. Reported-by: Mihai Rusu <dizzy@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | ident: keep separate "explicit" flags for author and committerJeff King2012-11-151-7/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We keep track of whether the user ident was given to us explicitly, or if we guessed at it from system parameters like username and hostname. However, we kept only a single variable. This covers the common cases (because the author and committer will usually come from the same explicit source), but can miss two cases: 1. GIT_COMMITTER_* is set explicitly, but we fallback for GIT_AUTHOR. We claim the ident is explicit, even though the author is not. 2. GIT_AUTHOR_* is set and we ask for author ident, but not committer ident. We will claim the ident is implicit, even though it is explicit. This patch uses two variables instead of one, updates both when we set the "fallback" values, and updates them individually when we read from the environment. Rather than keep user_ident_sufficiently_given as a compatibility wrapper, we update the only two callers to check the committer_ident, which matches their intent and what was happening already. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | ident: make user_ident_explicitly_given staticJeff King2012-11-151-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In v1.5.6-rc0~56^2 (2008-05-04) "user_ident_explicitly_given" was introduced as a global for communication between config, ident, and builtin-commit. In v1.7.0-rc0~72^2 (2010-01-07) readers switched to using the common wrapper user_ident_sufficiently_given(). After v1.7.11-rc1~15^2~18 (2012-05-21), the var is only written in ident.c. Now we can make it static, which will enable further refactoring without worrying about upsetting other code. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | ident.c: mark private file-scope symbols as staticJunio C Hamano2012-09-151-2/+2
|/ | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
* split_ident_line(): make best effort when parsing author/committer lineJunio C Hamano2012-08-311-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | Commits made by ancient version of Git allowed committer without human readable name, like this (00213b17c in the kernel history): tree 6947dba41f8b0e7fe7bccd41a4840d6de6a27079 parent 352dd1df32e672be4cff71132eb9c06a257872fe author Petr Baudis <pasky@ucw.cz> 1135223044 +0100 committer <sam@mars.ravnborg.org> 1136151043 +0100 kconfig: Remove support for lxdialog --checklist ... Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> When fed such a commit, --format='%ci' fails to parse it, and gives back an empty string. Update the split_ident_line() to be a bit more lenient when parsing, but make sure the caller that wants to pick up sane value from its return value does its own validation. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jk/ident-gecos-strbuf'Junio C Hamano2012-05-291-127/+116
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes quite a lot of brokenness when ident information needs to be taken from the system and cleans up the code. By Jeff King * jk/ident-gecos-strbuf: (22 commits) format-patch: do not use bogus email addresses in message ids ident: reject bogus email addresses with IDENT_STRICT ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT format-patch: use GIT_COMMITTER_EMAIL in message ids ident: let callers omit name with fmt_indent ident: refactor NO_DATE flag in fmt_ident ident: reword empty ident error message format-patch: refactor get_patch_filename ident: trim whitespace from default name/email ident: use a dynamic strbuf in fmt_ident ident: use full dns names to generate email addresses ident: report passwd errors with a more friendly message drop length limitations on gecos-derived names and emails ident: don't write fallback username into git_default_name fmt_ident: drop IDENT_WARN_ON_NO_NAME code format-patch: use default email for generating message ids ident: trim trailing newline from /etc/mailname move git_default_* variables to ident.c move identity config parsing to ident.c fmt-merge-msg: don't use static buffer in record_person ...
| * ident: reject bogus email addresses with IDENT_STRICTJeff King2012-05-241-0/+6
| | | | | | | | | | | | | | | | | | | | If we come up with a hostname like "foo.(none)" because the user's machine is not fully qualified, we should reject this in strict mode (e.g., when we are making a commit object), just as we reject an empty gecos username. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICTJeff King2012-05-241-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Callers who ask for ERROR_ON_NO_NAME are not so much concerned that the name will be blank (because, after all, we will fall back to using the username), but rather it is a check to make sure that low-quality identities do not end up in things like commit messages or emails (whereas it is OK for them to end up in things like reflogs). When future commits add more quality checks on the identity, each of these callers would want to use those checks, too. Rather than modify each of them later to add a new flag, let's refactor the flag. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: let callers omit name with fmt_indentJeff King2012-05-241-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most callers want to see all of "$name <$email> $date", but a few want only limited parts, omitting the date, or even the name. We already have IDENT_NO_DATE to handle the date part, but there's not a good option for getting just the email. Callers have to done one of: 1. Call ident_default_email; this does not respect environment variables, nor does it promise to trim whitespace or other crud from the result. 2. Call git_{committer,author}_info; this returns the name and email, leaving the caller to parse out the wanted bits. This patch adds IDENT_NO_NAME; it stops short of adding IDENT_NO_EMAIL, as no callers want it (nor are likely to), and it complicates the error handling of the function. When no name is requested, the angle brackets (<>) around the email address are also omitted. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: refactor NO_DATE flag in fmt_identJeff King2012-05-241-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a short-hand, we extract this flag into the local variable "name_addr_only". It's more accurate to simply negate this and refer to it as "want_date", which will be less confusing when we add more NO_* flags. While we're touching this part of the code, let's move the call to ident_default_date() only when we are actually going to use it, not when we have NO_DATE set, or when we get a date from the environment. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: reword empty ident error messageJeff King2012-05-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | There's on point in printing the name, since it is by definition the empty string if we have reached this code path. Instead, let's be more clear that we are complaining about the empty name, but still show the email address that it is attached to (since that may provide some context to the user). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: trim whitespace from default name/emailJeff King2012-05-221-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Usually these values get fed to fmt_ident, which will trim any cruft anyway, but there are a few code paths which use them directly. Let's clean them up for the benefit of those callers. Furthermore, fmt_ident will look at the pre-trimmed value and decide whether to invoke ERROR_ON_NO_NAME; this check can be fooled by a name consisting only of spaces. Note that we only bother to clean up when we are pulling the information from gecos or from system files. Any other value comes from a config file, where we will have cleaned up accidental whitespace already. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: use a dynamic strbuf in fmt_identJeff King2012-05-221-28/+15
| | | | | | | | | | | | | | | | | | | | | | | | Now that we accept arbitrary-sized names and email addresses, the only remaining limit is in the actual formatting of the names into a buffer. The current limit is 1000 characters, which is not likely to be reached, but using a strbuf is one less error condition we have to worry about. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: use full dns names to generate email addressesJeff King2012-05-221-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we construct an email address from the username and hostname, we generate the host part of the email with this procedure: 1. add the result of gethostname 2. if it has a dot, ok, it's fully qualified 3. if not, then look up the unqualified hostname via gethostbyname; take the domain name of the result and append it to the hostname Step 3 can actually produce a bogus result, as the name returned by gethostbyname may not be related to the hostname we fed it (e.g., consider a machine "foo" with names "foo.one.example.com" and "bar.two.example.com"; we may have the latter returned and generate the bogus name "foo.two.example.com"). This patch simply uses the full hostname returned by gethostbyname. In the common case that the first part is the same as the unqualified hostname, the behavior is identical. And in the case that it is not the same, we are much more likely to be generating a valid name. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: report passwd errors with a more friendly messageJeff King2012-05-221-15/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When getpwuid fails, we give a cute but cryptic message. While it makes sense if you know that getpwuid or identity functions are being called, this code is triggered behind the scenes by quite a few git commands these days (e.g., receive-pack on a remote server might use it for a reflog; the current message is hard to distinguish from an authentication error). Let's switch to something that gives a little more context. While we're at it, we can factor out all of the cut-and-pastes of the "you don't exist" message into a wrapper function. Rather than provide xgetpwuid, let's make it even more specific to just getting the passwd entry for the current uid. That's the only way we use getpwuid anyway, and it lets us make an even more specific error message. The current message also fails to mention errno. While the usual cause for getpwuid failing is that the user does not exist, mentioning errno makes it easier to diagnose these problems. Note that POSIX specifies that errno remain untouched if the passwd entry does not exist (but will be set on actual errors), whereas some systems will return ENOENT or similar for a missing entry. We handle both cases in our wrapper. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * drop length limitations on gecos-derived names and emailsJeff King2012-05-221-65/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we pull the user's name from the GECOS field of the passwd file (or generate an email address based on their username and hostname), we put the result into a static buffer. While it's extremely unlikely that anybody ever hit these limits (after all, in such a case their parents must have hated them), we still had to deal with the error cases in our code. Converting these static buffers to strbufs lets us simplify the code and drop some error messages from the documentation that have confused some users. The conversion is mostly mechanical: replace string copies with strbuf equivalents, and access the strbuf.buf directly. There are a few exceptions: - copy_gecos and copy_email are the big winners in code reduction (since they no longer have to manage the string length manually) - git_ident_config wants to replace old versions of the default name (e.g., if we read the config multiple times), so it must reset+add to the strbuf instead of just adding Note that there is still one length limitation: the gethostname interface requires us to provide a static buffer, so we arbitrarily choose 1024 bytes for the hostname. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: don't write fallback username into git_default_nameJeff King2012-05-221-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fmt_ident function gets a flag that tells us whether to die if the name field is blank. If it is blank and we don't die, then we fall back to the username from the passwd file. The current code writes the value into git_default_name. However, that's not necessarily correct, as the empty value might have come from git_default_name, or it might have been passed in. This leads to two potential problems: 1. If we are overriding an empty name in the passed-in value, then we may be overwriting a perfectly good name (from gitconfig or gecos) in the git_default_name buffer. Later calls to fmt_ident will end up using the fallback name, even though a better name was available. 2. If we override an empty gecos name, we end up with the fallback name in git_default_name. A later call that uses IDENT_ERROR_ON_NO_NAME will see the fallback name and think that it is a good name, instead of producing an error. In other words, a blank gecos name would cause an error with this code: git_committer_info(IDENT_ERROR_ON_NO_NAME); but not this: git_committer_info(0); git_committer_info(IDENT_ERROR_ON_NO_NAME); because in the latter case, the first call has polluted the name buffer. Instead, let's make the fallback a per-invocation variable. We can just use the pw->pw_name string directly, since it only needs to persist through the rest of the function (and we don't do any other getpwent calls). Note that while this solves (1) for future invocations of fmt_indent, the current invocation might use the fallback when it could in theory load a better value from git_default_name. However, by not passing IDENT_ERROR_ON_NO_NAME, the caller is indicating that it does not care too much about the name, anyway, so we don't bother; this is primarily about protecting future callers who do care. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * fmt_ident: drop IDENT_WARN_ON_NO_NAME codeJeff King2012-05-221-7/+4
| | | | | | | | | | | | | | There are no more callers who want this, so we can drop it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: trim trailing newline from /etc/mailnameJeff King2012-05-221-0/+4
| | | | | | | | | | | | | | | | | | | | | | We use fgets to read the /etc/mailname file, which means we will typically end up with an extra newline in our git_default_email. Most of the time this doesn't matter, as fmt_ident will skip it as cruft, but there is one code path that accesses it directly (in http-push.c:lock_remote). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * move git_default_* variables to ident.cJeff King2012-05-221-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no reason anybody outside of ident.c should access these directly (they should use the new accessors which make sure the variables are initialized), so we can make them file-scope statics. While we're at it, move user_ident_explicitly_given into ident.c; while still globally visible, it makes more sense to reside with the ident code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * move identity config parsing to ident.cJeff King2012-05-221-0/+21
| | | | | | | | | | | | | | | | | | There's no reason for this to be in config, except that once upon a time all of the config parsing was there. It makes more sense to keep the ident code together. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * ident: split setup_ident into separate functionsJeff King2012-05-221-16/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function sets up the default name, email, and date, and is not publicly available. Let's split it into three public functions so that callers can get just the parts they need. While we're at it, let's change the interface to simple accessors. The original function was called only by fmt_ident, and contained logic for "if we already have some other value, don't load the default" which properly belongs in fmt_ident. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | fix off-by-one error in split_ident_lineJeff King2012-05-221-1/+1
|/ | | | | | | | | | | | | | Commit 4b340cf split the logic to parse an ident line out of pretty.c's format_person_part. But in doing so, it accidentally introduced an off-by-one error that caused it to think that single-character names were invalid. This manifested itself as the "%an" format failing to show anything at all for a single-character name. Reported-by: Brian Turner <bturner@atlassian.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ident.c: add split_ident_line() to parse formatted ident lineJunio C Hamano2012-03-111-0/+68
| | | | | | | | | | | | | The commit formatting logic format_person_part() in pretty.c implements the logic to split an author/committer ident line into its parts, intermixed with logic to compute its output using these piece it computes. Separate the former out to a helper function split_ident_line() so that other codepath can use the same logic, and rewrite the function using the helper function. Signed-off-by: Junio C Hamano <gitster@pobox.com>