- Databases are one of the most important parts of Forgejo, every
interaction uses the database in one way or another. Therefore, it is
important to maintain the database and recognize when the server is not
doing well with the database. There already is the option to log *every*
SQL query along with its execution time, but monitoring becomes
impractical for larger instances and takes up unnecessary storage in the
logs.
- Add a QoL enhancement that allows instance administrators to specify a
threshold value beyond which query execution time is logged as a warning
in the xorm logger. The default value is a conservative five seconds to
avoid this becoming a source of spam in the logs.
- The use case for this patch is that with an instance the size of
Codeberg, monitoring SQL logs is not very fruitful and most of them are
uninteresting. Recently, in the context of persistent deadlock issues
(https://codeberg.org/forgejo/forgejo/issues/220), I have noticed that
certain queries hold locks on tables like comment and issue for several
seconds. This patch helps to identify which queries these are and when
they happen.
- Added unit test.
(cherry picked from commit 9cf501f1af4cd870221cef6af489618785b71186)
---------
Co-authored-by: Gusted <postmaster@gusted.xyz>
Co-authored-by: Giteabot <teabot@gitea.io>
Co-authored-by: 6543 <6543@obermui.de>
Fix#28843
This PR will bypass the pushUpdateTag to database failure when
syncAllTags. An error log will be recorded.
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Fix#29175
Replace #29207
This PR makes some improvements to the `issue_comment` workflow trigger
event.
1. Fix the bug that pull requests cannot trigger `issue_comment`
workflows
2. Previously the `issue_comment` event only supported the `created`
activity type. This PR adds support for the missing `edited` and
`deleted` activity types.
3. Some events (including `issue_comment`, `issues`, etc. ) only trigger
workflows that belong to the workflow file on the default branch. This
PR introduces the `IsDefaultBranchWorkflow` function to check for these
events.
Fixes#29101
Related #29298
Discard all read data to prevent misinterpreting existing data. Some
discard calls were missing in error cases.
---------
Co-authored-by: yp05327 <576951401@qq.com>
Fixes the reason why #29101 is hard to replicate.
Related #29297
Create a repo with a file with minimum size 4097 bytes (I use 10000) and
execute the following code:
```go
gitRepo, err := gitrepo.OpenRepository(db.DefaultContext, <repo>)
assert.NoError(t, err)
commit, err := gitRepo.GetCommit(<sha>)
assert.NoError(t, err)
entry, err := commit.GetTreeEntryByPath(<file>)
assert.NoError(t, err)
b := entry.Blob()
// Create a reader
r, err := b.DataAsync()
assert.NoError(t, err)
defer r.Close()
// Create a second reader
r2, err := b.DataAsync()
assert.NoError(t, err) // Should be no error but is ErrNotExist
defer r2.Close()
```
The problem is the check in `CatFileBatch`:
79217ea63c/modules/git/repo_base_nogogit.go (L81-L87)
`Buffered() > 0` is used to check if there is a "operation" in progress
at the moment. This is a problem because we can't control the internal
buffer in the `bufio.Reader`. The code above demonstrates a sequence
which initiates an operation for which the code thinks there is no
active processing. The second call to `DataAsync()` therefore reuses the
existing instances instead of creating a new batch reader.
Follow #29024
Major changes:
* refactor validLinksPattern to fullURLPattern and add comments, now it
accepts "protocol:" prefix
* rename `IsLink*` to `IsFullURL*`, and remove unnecessray "mailto:"
check
* fix some comments (by the way)
* rename EmojiShortCodeRegex -> emojiShortCodeRegex (by the way)
Fix#29166
Add support for the following activity types of `pull_request`
- assigned
- unassigned
- review_requested
- review_request_removed
- milestoned
- demilestoned
Follow #29165.
* Introduce JSONTemplate to help to render JSON templates
* Introduce JSEscapeSafe for templates. Now only use `{{ ... |
JSEscape}}` instead of `{{ ... | JSEscape | Safe}}`
* Simplify "UserLocationMapURL" useage
Old code is not consistent for generating & decoding the JWT secrets.
Now, the callers only need to use 2 consistent functions:
NewJwtSecretWithBase64 and DecodeJwtSecretBase64
And remove a non-common function Base64FixedDecode from util.go
Clarify when "string" should be used (and be escaped), and when
"template.HTML" should be used (no need to escape)
And help PRs like #29059 , to render the error messages correctly.
Introduce a new function checkGitVersionCompatibility, when the git
version can't be used by Gitea, tell the end users to downgrade or
upgrade. The refactored functions are related to make the code easier to
test.
And simplify the comments for "safe.directory"
---------
Co-authored-by: delvh <dev.lh@web.de>
With this option, it is possible to require a linear commit history with
the following benefits over the next best option `Rebase+fast-forward`:
The original commits continue existing, with the original signatures
continuing to stay valid instead of being rewritten, there is no merge
commit, and reverting commits becomes easier.
Closes#24906
Replace #28849. Thanks to @yp05327 for the looking into the problem.
Fix#28840
The old behavior of newSignatureFromCommitline is not right. The new
parseSignatureFromCommitLine:
1. never fails
2. only accept one format (if there is any other, it could be easily added)
And add some tests.
Try to improve #28949
1. Make `ctx.Data["ShowOutdatedComments"] = true` by default: it brings
consistent user experience, and sometimes the "outdated (source
changed)" comments are still valuable.
2. Show a friendly message if the comment won't show, then the end users
won't fell that "the comment disappears" (it is the special case when
`ShowOutdatedComments = false`)
Fix for gitea putting everything into one request without batching and
sending it to Elasticsearch for indexing as issued in #28117
This issue occured in large repositories while Gitea tries to
index the code using ElasticSearch.
I've applied necessary changes that takes batch length from below config
(app.ini)
```
[queue.code_indexer]
BATCH_LENGTH=<length_int>
```
and batches all requests to Elasticsearch in chunks as configured in the
above config
Resolves https://github.com/go-gitea/gitea/issues/28704
Example of an entry in the generated `APKINDEX` file:
```
C:Q1xCO3H9LTTEbhKt9G1alSC87I56c=
P:hello
V:2.12-r1
A:x86_64
T:The GNU Hello program produces a familiar, friendly greeting
U:https://www.gnu.org/software/hello/
L:GPL-3.0-or-later
S:15403
I:36864
o:hello
m:
t:1705934118
D:so:libc.musl-x86_64.so.1
p:cmd:hello=2.12-r1
i:foobar=1.0 !baz
k:42
```
the `i:` and `k:` entries are new.
---------
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
Fixes#28660
Fixes an admin api bug related to `user.LoginSource`
Fixed `/user/emails` response not identical to GitHub api
This PR unifies the user update methods. The goal is to keep the logic
only at one place (having audit logs in mind). For example, do the
password checks only in one method not everywhere a password is updated.
After that PR is merged, the user creation should be next.
In #28691, schedule plans will be deleted when a repo's actions unit is
disabled. But when the unit is enabled, the schedule plans won't be
created again.
This PR fixes the bug. The schedule plans will be created again when the
actions unit is re-enabled
Renames it to `ENABLED` to be consistent with other settings and
deprecates it.
I believe this change is necessary because other setting groups such as
`attachment`, `cors`, `mailer`, etc. have an `ENABLED` setting, but
`oauth2` is the only one with an `ENABLE` setting, which could cause
confusion for users.
This is no longer a breaking change because `ENABLE` has been set as
deprecated and as an alias to `ENABLED`.
## Purpose
This is a refactor toward building an abstraction over managing git
repositories.
Afterwards, it does not matter anymore if they are stored on the local
disk or somewhere remote.
## What this PR changes
We used `git.OpenRepository` everywhere previously.
Now, we should split them into two distinct functions:
Firstly, there are temporary repositories which do not change:
```go
git.OpenRepository(ctx, diskPath)
```
Gitea managed repositories having a record in the database in the
`repository` table are moved into the new package `gitrepo`:
```go
gitrepo.OpenRepository(ctx, repo_model.Repo)
```
Why is `repo_model.Repository` the second parameter instead of file
path?
Because then we can easily adapt our repository storage strategy.
The repositories can be stored locally, however, they could just as well
be stored on a remote server.
## Further changes in other PRs
- A Git Command wrapper on package `gitrepo` could be created. i.e.
`NewCommand(ctx, repo_model.Repository, commands...)`. `git.RunOpts{Dir:
repo.RepoPath()}`, the directory should be empty before invoking this
method and it can be filled in the function only. #28940
- Remove the `RepoPath()`/`WikiPath()` functions to reduce the
possibility of mistakes.
---------
Co-authored-by: delvh <dev.lh@web.de>
The `ToUTF8*` functions were stripping BOM, while BOM is actually valid
in UTF8, so the stripping must be optional depending on use case. This
does:
- Add a options struct to all `ToUTF8*` functions, that by default will
strip BOM to preserve existing behaviour
- Remove `ToUTF8` function, it was dead code
- Rename `ToUTF8WithErr` to `ToUTF8`
- Preserve BOM in Monaco Editor
- Remove a unnecessary newline in the textarea value. Browsers did
ignore it, it seems but it's better not to rely on this behaviour.
Fixes: https://github.com/go-gitea/gitea/issues/28743
Related: https://github.com/go-gitea/gitea/issues/6716 which seems to
have once introduced a mechanism that strips and re-adds the BOM, but
from what I can tell, this mechanism was removed at some point after
that PR.
Fix#22066
# Purpose
This PR fix the releases will be deleted when mirror repository sync the
tags.
# The problem
In the previous implementation of #19125. All releases record in
databases of one mirror repository will be deleted before sync.
Ref:
https://github.com/go-gitea/gitea/pull/19125/files#diff-2aa04998a791c30e5a02b49a97c07fcd93d50e8b31640ce2ddb1afeebf605d02R481
# The Pros
This PR introduced a new method which will load all releases from
databases and all tags on git data into memory. And detect which tags
needs to be inserted, which tags need to be updated or deleted. Only
tags releases(IsTag=true) which are not included in git data will be
deleted, only tags which sha1 changed will be updated. So it will not
delete any real releases include drafts.
# The Cons
The drawback is the memory usage will be higher than before if there are
many tags on this repository. This PR defined a special release struct
to reduce columns loaded from database to memory.
This should fix https://github.com/go-gitea/gitea/issues/28927
Technically older versions of Git would support this flag as well, but
per https://github.com/go-gitea/gitea/pull/28466 that's the version
where using it (object-format=sha256) left "experimental" state.
`sha1` is (currently) the default, so older clients should be unaffected
in either case.
Signed-off-by: jolheiser <john.olheiser@gmail.com>
When LFS hooks are present in gitea-repositories, operations like git
push for creating a pull request fail. These repositories are not meant
to include LFS files or git push them, that is handled separately. And
so they should not have LFS hooks.
Installing git-lfs on some systems (like Debian Linux) will
automatically set up /etc/gitconfig to create LFS hooks in repositories.
For most git commands in Gitea this is not a problem, either because
they run on a temporary clone or the git command does not create LFS
hooks.
But one case where this happens is git archive for creating repository
archives. To fix that, add a GIT_CONFIG_NOSYSTEM=1 to disable using the
system configuration for that command.
According to a comment, GIT_CONFIG_NOSYSTEM is not used for all git
commands because the system configuration can be intentionally set up
for Gitea to use.
Resolves#19810, #21148
Sometimes you need to work on a feature which depends on another (unmerged) feature.
In this case, you may create a PR based on that feature instead of the main branch.
Currently, such PRs will be closed without the possibility to reopen in case the parent feature is merged and its branch is deleted.
Automatic target branch change make life a lot easier in such cases.
Github and Bitbucket behave in such way.
Example:
$PR_1$: main <- feature1
$PR_2$: feature1 <- feature2
Currently, merging $PR_1$ and deleting its branch leads to $PR_2$ being closed without the possibility to reopen.
This is both annoying and loses the review history when you open a new PR.
With this change, $PR_2$ will change its target branch to main ($PR_2$: main <- feature2) after $PR_1$ has been merged and its branch has been deleted.
This behavior is enabled by default but can be disabled.
For security reasons, this target branch change will not be executed when merging PRs targeting another repo.
Fixes#27062Fixes#18408
---------
Co-authored-by: Denys Konovalov <kontakt@denyskon.de>
Co-authored-by: delvh <dev.lh@web.de>
Fixes#26548
This PR refactors the rendering of markup links. The old code uses
`strings.Replace` to change some urls while the new code uses more
context to decide which link should be generated.
The added tests should ensure the same output for the old and new
behaviour (besides the bug).
We may need to refactor the rendering a bit more to make it clear how
the different helper methods render the input string. There are lots of
options (resolve links / images / mentions / git hashes / emojis / ...)
but you don't really know what helper uses which options. For example,
we currently support images in the user description which should not be
allowed I think:
<details>
<summary>Profile</summary>
https://try.gitea.io/KN4CK3R
![grafik](https://github.com/go-gitea/gitea/assets/1666336/109ae422-496d-4200-b52e-b3a528f553e5)
</details>
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Fixes#27114.
* In Gitea 1.12 (#9532), a "dismiss stale approvals" branch protection
setting was introduced, for ignoring stale reviews when verifying the
approval count of a pull request.
* In Gitea 1.14 (#12674), the "dismiss review" feature was added.
* This caused confusion with users (#25858), as "dismiss" now means 2
different things.
* In Gitea 1.20 (#25882), the behavior of the "dismiss stale approvals"
branch protection was modified to actually dismiss the stale review.
For some users this new behavior of dismissing the stale reviews is not
desirable.
So this PR reintroduces the old behavior as a new "ignore stale
approvals" branch protection setting.
---------
Co-authored-by: delvh <dev.lh@web.de>
Fix#28157
This PR fix the possible bugs about actions schedule.
## The Changes
- Move `UpdateRepositoryUnit` and `SetRepoDefaultBranch` from models to
service layer
- Remove schedules plan from database and cancel waiting & running
schedules tasks in this repository when actions unit has been disabled
or global disabled.
- Remove schedules plan from database and cancel waiting & running
schedules tasks in this repository when default branch changed.
Mainly for MySQL/MSSQL.
It is important for Gitea to use case-sensitive database charset
collation. If the database is using a case-insensitive collation, Gitea
will show startup error/warning messages, and show the errors/warnings
on the admin panel's Self-Check page.
Make `gitea doctor convert` work for MySQL to convert the collations of
database & tables & columns.
* Fix#28131
## ⚠️ BREAKING ⚠️
It is not quite breaking, but it's highly recommended to convert the
database&table&column to a consistent and case-sensitive collation.
In #26365 issue references were disabled entirely for documents,
intending to match GitHub behavior. However cross-references do appear
to work in documents on GitHub.
This is useful for example to write release notes in a markdown document
and reference issues. While the simpler syntax may create links when not
intended, hopefully the cross-reference syntax is unique enough to avoid
it.
The CORS code has been unmaintained for long time, and the behavior is
not correct.
This PR tries to improve it. The key point is written as comment in
code. And add more tests.
Fix#28515Fix#27642Fix#17098
Fix#28526, regression of
* #26365
(although the author of #26365 has recent activities, but there is no
response for the regression, so I proposed this quick fix and keep the
fix simple to make it easier to backport to 1.21)
Nowadays, cache will be used on almost everywhere of Gitea and it cannot
be disabled, otherwise some features will become unaviable.
Then I think we can just remove the option for cache enable. That means
cache cannot be disabled.
But of course, we can still use cache configuration to set how should
Gitea use the cache.
The 4 functions are duplicated, especially as interface methods. I think
we just need to keep `MustID` the only one and remove other 3.
```
MustID(b []byte) ObjectID
MustIDFromString(s string) ObjectID
NewID(b []byte) (ObjectID, error)
NewIDFromString(s string) (ObjectID, error)
```
Introduced the new interfrace method `ComputeHash` which will replace
the interface `HasherInterface`. Now we don't need to keep two
interfaces.
Reintroduced `git.NewIDFromString` and `git.MustIDFromString`. The new
function will detect the hash length to decide which objectformat of it.
If it's 40, then it's SHA1. If it's 64, then it's SHA256. This will be
right if the commitID is a full one. So the parameter should be always a
full commit id.
@AdamMajer Please review.
- Modify the `Password` field in `CreateUserOption` struct to remove the
`Required` tag
- Update the `v1_json.tmpl` template to include the `email` field and
remove the `password` field
---------
Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Update golang.org/x/crypto for CVE-2023-48795 and update other packages.
`go-git` is not updated because it needs time to figure out why some
tests fail.
- If a topic has zero repository count, it means that none of the
repositories are using that topic, that would make them 'useless' to
keep. One caveat is that if that topic is going to be used in the
future, it will be added again to the database, but simply with a new
ID.
Refs: https://codeberg.org/forgejo/forgejo/pulls/1964
Co-authored-by: Gusted <postmaster@gusted.xyz>
- Remove `ObjectFormatID`
- Remove function `ObjectFormatFromID`.
- Use `Sha1ObjectFormat` directly but not a pointer because it's an
empty struct.
- Store `ObjectFormatName` in `repository` struct
- When a repository is orphaned and has objects stored in any of the
storages such as repository avatar or attachments the delete function
would error, because the storage module wasn't initalized.
- Add code to initialize the storage module.
Refs: https://codeberg.org/forgejo/forgejo/pulls/1954
Co-authored-by: Gusted <postmaster@gusted.xyz>
Refactor Hash interfaces and centralize hash function. This will allow
easier introduction of different hash function later on.
This forms the "no-op" part of the SHA256 enablement patch.
## Changes
- Add deprecation warning to `Token` and `AccessToken` authentication
methods in swagger.
- Add deprecation warning header to API response. Example:
```
HTTP/1.1 200 OK
...
Warning: token and access_token API authentication is deprecated
...
```
- Add setting `DISABLE_QUERY_AUTH_TOKEN` to reject query string auth
tokens entirely. Default is `false`
## Next steps
- `DISABLE_QUERY_AUTH_TOKEN` should be true in a subsequent release and
the methods should be removed in swagger
- `DISABLE_QUERY_AUTH_TOKEN` should be removed and the implementation of
the auth methods in question should be removed
## Open questions
- Should there be further changes to the swagger documentation?
Deprecation is not yet supported for security definitions (coming in
[OpenAPI Spec version
3.2.0](https://github.com/OAI/OpenAPI-Specification/issues/2506))
- Should the API router logger sanitize urls that use `token` or
`access_token`? (This is obviously an insufficient solution on its own)
---------
Co-authored-by: delvh <dev.lh@web.de>
1. Do not sort the "checks" slice again and again when "Register", it
just wastes CPU when the Gitea instance runs
2. If a check doesn't exist, tell the end user
3. Add some tests
The function `GetByBean` has an obvious defect that when the fields are
empty values, it will be ignored. Then users will get a wrong result
which is possibly used to make a security problem.
To avoid the possibility, this PR removed function `GetByBean` and all
references.
And some new generic functions have been introduced to be used.
The recommand usage like below.
```go
// if query an object according id
obj, err := db.GetByID[Object](ctx, id)
// query with other conditions
obj, err := db.Get[Object](ctx, builder.Eq{"a": a, "b":b})
```
It will fix#28268 .
<img width="1313" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/cb1e07d5-7a12-4691-a054-8278ba255bfc">
<img width="1318" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/4fd60820-97f1-4c2c-a233-d3671a5039e9">
## ⚠️ BREAKING ⚠️
But need to give up some features:
<img width="1312" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/281c0d51-0e7d-473f-bbed-216e2f645610">
However, such abandonment may fix#28055 .
## Backgroud
When the user switches the dashboard context to an org, it means they
want to search issues in the repos that belong to the org. However, when
they switch to themselves, it means all repos they can access because
they may have created an issue in a public repo that they don't own.
<img width="286" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/182dcd5b-1c20-4725-93af-96e8dfae5b97">
It's a confusing design. Think about this: What does "In your
repositories" mean when the user switches to an org? Repos belong to the
user or the org?
Whatever, it has been broken by #26012 and its following PRs. After the
PR, it searches for issues in repos that the dashboard context user owns
or has been explicitly granted access to, so it causes #28268.
## How to fix it
It's not really difficult to fix it. Just extend the repo scope to
search issues when the dashboard context user is the doer. Since the
user may create issues or be mentioned in any public repo, we can just
set `AllPublic` to true, which is already supported by indexers. The DB
condition will also support it in this PR.
But the real difficulty is how to count the search results grouped by
repos. It's something like "search issues with this keyword and those
filters, and return the total number and the top results. **Then, group
all of them by repo and return the counts of each group.**"
<img width="314" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/5206eb20-f8f5-49b9-b45a-1be2fcf679f4">
Before #26012, it was being done in the DB, but it caused the results to
be incomplete (see the description of #26012).
And to keep this, #26012 implement it in an inefficient way, just count
the issues by repo one by one, so it cannot work when `AllPublic` is
true because it's almost impossible to do this for all public repos.
1bfcdeef4c/modules/indexer/issues/indexer.go (L318-L338)
## Give up unnecessary features
We may can resovle `TODO: use "group by" of the indexer engines to
implement it`, I'm sure it can be done with Elasticsearch, but IIRC,
Bleve and Meilisearch don't support "group by".
And the real question is, does it worth it? Why should we need to know
the counts grouped by repos?
Let me show you my search dashboard on gitea.com.
<img width="1304" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/2bca2d46-6c71-4de1-94cb-0c9af27c62ff">
I never think the long repo list helps anything.
And if we agree to abandon it, things will be much easier. That is this
PR.
## TODO
I know it's important to filter by repos when searching issues. However,
it shouldn't be the way we have it now. It could be implemented like
this.
<img width="1316" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/99ee5f21-cbb5-4dfe-914d-cb796cb79fbe">
The indexers support it well now, but it requires some frontend work,
which I'm not good at. So, I think someone could help do that in another
PR and merge this one to fix the bug first.
Or please block this PR and help to complete it.
Finally, "Switch dashboard context" is also a design that needs
improvement. In my opinion, it can be accomplished by adding filtering
conditions instead of "switching".
The summary string ends up in the database, and (at least) MySQL &
PostgreSQL require valid UTF8 strings.
Fixes#28178
Co-authored-by: Darrin Smart <darrin@filmlight.ltd.uk>